diff --git a/PaddleCV/video/application/video_tag/README.md b/PaddleCV/video/application/video_tag/README.md new file mode 100644 index 0000000000000000000000000000000000000000..262d8d82cb8e83d591c5803de85d65cdc417e2a3 --- /dev/null +++ b/PaddleCV/video/application/video_tag/README.md @@ -0,0 +1,115 @@ +# VideoTag 飞桨大规模视频分类模型 + +--- +## 内容 + +- [模型简介](#模型简介) +- [安装说明](#安装说明) +- [数据准备](#数据准备) +- [模型推断](#模型推断) +- [模型微调](#模型微调) +- [参考论文](#参考论文) + + +## 模型简介 + +飞桨大规模视频分类模型VideoTag基于百度短视频业务千万级数据,支持3000个源于产业实践的实用标签,具有良好的泛化能力,非常适用于国内大规模(千万/亿/十亿级别)短视频分类场景的应用。VideoTag采用两阶段建模方式,即图像建模和序列学习。第一阶段,使用少量视频样本(十万级别)训练大规模视频特征提取模型(Extractor);第二阶段,使用千万级数据训练预测器(Predictor),最终实现在超大规模(千万/亿/十亿级别)短视频上产业应用,其原理示意如下图所示。 + +

+
+VideoTag 原理示意图
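上文描述的两阶段建模流程可以用下面的示意代码理解。该示例仅用随机数演示各阶段的数据形状变换(300 帧图像 → (300, 2048) 特征序列 → 3396 类概率分布 → top-20 标签),其中用"均值池化 + 随机线性层"近似代替真实的 TSN 与 AttentionLSTM,是一个假设性的形状演示,并非 VideoTag 的实际实现:

```python
import numpy as np

np.random.seed(0)
seg_num, feature_dim, num_classes, topk = 300, 2048, 3396, 20

# 第一阶段(Extractor):TSN 对均匀抽取的 300 帧逐帧提取特征,
# 一段视频被转化为形状为 (300, 2048) 的特征序列
frame_features = np.random.rand(seg_num, feature_dim).astype("float32")

# 第二阶段(Predictor):序列模型将特征序列映射为 3396 类的打分
# (此处用均值池化加随机线性层近似,实际为 AttentionLSTM)
w = np.random.rand(feature_dim, num_classes).astype("float32") * 0.01
logits = frame_features.mean(axis=0) @ w

# softmax 归一化为概率分布
probs = np.exp(logits - logits.max())
probs = probs / probs.sum()

# 预测结果:取概率最高的 top-20 标签索引(对应 label_3396.txt 的行号)
topk_inds = probs.argsort()[-topk:][::-1]
```

实际推断中,这两个阶段分别由 weights 目录下的 TSN 与 AttentionLSTM 预训练参数完成。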

+ +- 数据处理:视频是按特定顺序排列的一组图像的集合,这些图像也称为帧。视频分类任务需要先对短视频进行解码,然后再将输出的图像帧序列灌入到VideoTag中进行训练和预测。 + +- 图像建模:先从训练数据中,对每个类别均匀采样少量样本数据,构成十万量级的训练视频。然后使用TSN网络进行训练,提取所有视频帧的TSN模型分类层前一层的特征数据。在这个过程中,每一帧都被转化成相应的特征向量,一段视频被转化成一个特征序列。 + +- 序列学习:采用Attention clusters、LSTM和Nextvlad对特征序列进行建模,学习各个特征之间的组合方式,进一步提高模型准确率。由于序列学习相比于图像建模耗时更短,因此可以融合多个具有互补性的序列模型。示例代码仅使用Attention\_LSTM网络进行序列特征预测。 + +- 预测结果:融合多个模型结果实现视频分类,进一步提高分类准确率。 + + +## 安装说明 + +运行样例代码需要PaddlePaddle版本>= 1.7.0,请参考[安装文档](https://www.paddlepaddle.org.cn/documentation/docs/zh/1.7/install/index_cn.html)安装PaddlePaddle。 + +- 环境依赖: + +``` + CUDA >= 9.0 + cudnn >= 7.5 + OpenCV >= 4.1.0 : pip install opencv-python +``` + +## 数据准备 + +- 预训练权重下载:我们提供了[TSN](https://videotag.bj.bcebos.com/video_tag_tsn.tar)和[AttentionLSTM](https://videotag.bj.bcebos.com/video_tag_lstm.tar)预训练权重,请下载后解压,并将参数文件放在weights目录下,目录结构如下: + +``` +video_tag + ├──weights + ├── attention_lstm.pdmodel + ├── attention_lstm.pdopt + ├── attention_lstm.pdparams + ├── tsn.pdmodel + ├── tsn.pdopt + └── tsn.pdparams +``` + +- 示例视频下载:我们提供了[样例视频](https://videotag.bj.bcebos.com/mp4.tar)方便用户测试,请下载后解压,并将视频文件放置在video\_tag/data/mp4目录下,目录结构如下: + +``` +video_tag + ├──data + ├── mp4 + ├── 1.mp4 + └── 2.mp4 +``` + +- 目前支持的视频文件输入格式为:mp4、mkv和webm格式; + +- 模型会从输入的视频文件中均匀抽取300帧用于预测。对于较长的视频文件,建议先截取有效部分输入模型以提高预测速度。 + + +## 模型推断 + +模型推断的启动方式如下: + + bash run_TSN_LSTM.sh + +- 可修改video\_tag/data/tsn.list文件内容,指定待推断的文件路径列表; + +- 通过--filelist可指定输入list文件路径,默认为video\_tag/data/tsn.list; + +- 通过--extractor\_weights可指定特征提取器参数的存储路径,默认为video\_tag/weights/tsn; + +- 通过--predictor\_weights可指定预测器参数的存储路径,默认为video\_tag/weights/attention\_lstm; + +- 通过--use\_gpu参数可指定是否使用gpu进行推断,默认使用gpu。对于10s左右的短视频文件,gpu推断时间约为4s; + +- 通过--save\_dir可指定预测结果存储路径,默认为video\_tag/data/results,结果保存在json文件中,其格式为: + +``` + [file_path, + {"class_name": class_name1, "probability": probability1, "class_id": class_id1}, + {"class_name": class_name2, "probability": probability2, "class_id": class_id2}, + ... 
+ ] +``` + +- 通过--label\_file可指定标签文件存储路径,默认为video\_tag/label\_3396.txt; + +- 模型相关配置写在video\_tag/configs目录下的yaml文件中。 + + +## 模型微调 + +- VideoTag中的TSN模型只输出视频特征,无需输出最终分类结果,fine-tune请参考PaddleCV视频库[TSN视频分类模型](../../models/tsn/README.md)对应修改模型文件。 + +- VideoTag中的attention\_lstm模型只需要输入视频特征,无需音频特征输入,fine-tune请参考PaddleCV视频库[AttentionLSTM视频分类模型](../../models/attention_lstm/README.md)对应修改模型文件。 + +## 参考论文 + +- [Temporal Segment Networks: Towards Good Practices for Deep Action Recognition](https://arxiv.org/abs/1608.00859), Limin Wang, Yuanjun Xiong, Zhe Wang, Yu Qiao, Dahua Lin, Xiaoou Tang, Luc Van Gool + +- [Beyond Short Snippets: Deep Networks for Video Classification](https://arxiv.org/abs/1503.08909), Joe Yue-Hei Ng, Matthew Hausknecht, Sudheendra Vijayanarasimhan, Oriol Vinyals, Rajat Monga, George Toderici diff --git a/PaddleCV/video/application/video_tag/configs/attention_lstm.yaml b/PaddleCV/video/application/video_tag/configs/attention_lstm.yaml new file mode 100644 index 0000000000000000000000000000000000000000..389b1c0f9adc4f37268071839a4645e7a4f29002 --- /dev/null +++ b/PaddleCV/video/application/video_tag/configs/attention_lstm.yaml @@ -0,0 +1,15 @@ +MODEL: + name: "AttentionLSTM" + dataset: None + bone_nework: None + drop_rate: 0.5 + feature_num: 2 + feature_names: ['rgb'] + feature_dims: [2048] + embedding_size: 1024 + lstm_size: 512 + num_classes: 3396 + topk: 20 + +INFER: + batch_size: 1 diff --git a/PaddleCV/video/application/video_tag/configs/tsn.yaml b/PaddleCV/video/application/video_tag/configs/tsn.yaml new file mode 100644 index 0000000000000000000000000000000000000000..1ec4dfbf24b0f1c0a85e0960d8a59afde20cfb9b --- /dev/null +++ b/PaddleCV/video/application/video_tag/configs/tsn.yaml @@ -0,0 +1,20 @@ +MODEL: + name: "TSN" + format: "mp4" + num_classes: 400 + seglen: 1 + image_mean: [0.485, 0.456, 0.406] + image_std: [0.229, 0.224, 0.225] + num_layers: 50 + topk: 5 + +INFER: + seg_num: 300 + short_size: 256 + target_size: 224 + num_reader_threads: 1 + 
buf_size: 1024 + batch_size: 1 + kinetics_labels: None + video_path: "" + filelist: "./data/tsn.list" diff --git a/PaddleCV/video/application/video_tag/data/tsn.list b/PaddleCV/video/application/video_tag/data/tsn.list new file mode 100644 index 0000000000000000000000000000000000000000..44f6e8e43acc88626ade03d0c8dae29610633d34 --- /dev/null +++ b/PaddleCV/video/application/video_tag/data/tsn.list @@ -0,0 +1 @@ +data/mp4/1.mp4 diff --git a/PaddleCV/video/application/video_tag/label_3396.txt b/PaddleCV/video/application/video_tag/label_3396.txt new file mode 100644 index 0000000000000000000000000000000000000000..bcda50c015c15d0f0cbd129a251e4a58b1fc93bd --- /dev/null +++ b/PaddleCV/video/application/video_tag/label_3396.txt @@ -0,0 +1,3396 @@ +胶合板 +坠楼 +空手道 +弹奏 +直升机 +罗盘 +健身 +羽毛球拍 +龙与地下城 +漆 +混合器 +学生 +安全气囊 +法庭 +游泳池 +潜艇 +穆斯林头巾 +奇葩 +绞狐大冒险 +飞行器 +演出 +喷枪 +萝莉 +暗黑血统 +彭彭丁满历险记 +出生 +嫩模 +流星雨 +超市 +StepMania +自动扶梯 +讲座 +缝纫机 +自助餐 +衣服 +翼装飞行 +手语 +可爱颂 +复合弓 +列车 +欧洲模拟卡车 +吃豆人 +队长 +僵尸 +猩红 +战争片 +通关攻略 +横梁 +机场 +引体向上 +暴力片 +橱柜 +卡车 +美人 +仙境传说 +格斗 +奇趣蛋 +健美 +新能源 +佳能 +电视 +喊麦 +信件 +双胞胎 +膳食补充剂 +胸部 +碟子 +女排 +地铁:最后的曙光 +牛肉 +激光照明 +毛巾 +面包店 +时空之轮 +泰迪 +吉他 +绿茶 +自驾游 +签名会 +酱 +抽屉 +山火 +T台 +喝醉 +马桶 +巴松管 +皇帝 +沙丘 +主播 +炖汤 +糖 +球球大作战 +彩票 +中暑 +雷达 +独木舟 +星座 +弓箭 +跑车 +大豆 +妖怪 +激光 +中秋节 +风景 +橡皮筋 +固体 +音乐会 +幽灵 +救生员 +彩虹 +政治 +眼线 +柴 +医疗 +购物中心 +舰载机 +空战 +服装 +钢模 +拖鞋 +教室 +羽毛球 +烤肉 +煎饼 +金星 +火箭 +婴儿车 +黑暗之魂 +夏目友人帐 +图像处理 +恐龙 +柔术 +剪刀 +冒险任务世界 +冰雹 +木工刨 +白金汉宫 +可丽饼 +绅士 +盖瑞模组 +滑板 +游戏网站 +套房 +动作教学 +DOTA +海盗传说 +小马慢跑 +怪物中学 +快闪 +冠军 +手风琴 +工具 +进击的巨人 +怀孕 +停车场 +舌钉 +自行车运动 +飞檐走壁 +滑雪板 +保健 +大蒜 +门 +咏春 +热火吉他手 +筷子 +饮料罐 +拳无虚发 +糗事 +豆腐 +动物园大亨 +佛兰肯斯坦 +动漫 +机长 +脱发 +石英 +医生 +母婴 +数码 +螳螂 +加仑 +核电站 +老鹰 +哑铃 +成语故事 +情景剧 +小提琴 +熊猫 +泥石流 +贴花 +合唱团 +质量效应 +东京食尸鬼 +流行音乐 +犁 +帆 +监拍 +城市 +液氮 +扳手 +卫星 +跳伞 +三维 +美味 +特种部队 +名模 +手帕 +瀑布 +教师 +风铃 +爱丽丝梦游仙境 +风光 +通用电气公司 +逗比 +豹子 +石油 +仙乐传说 +晴天 +皮革 +露台·天井 +实验室 +口琴 +驾车 +枕头 +鸡 +遥控器 +铁路运输 +瓦片 +原子弹 +偶像剧 +闯关 +西游记 +吉他音箱 +车速表 +甜品 +电源供应器 +人行道 +疲劳驾驶 +房车 +量子 +民工 +薄暮传说 +节日 +连连看 +遥控 +科学探索 +银河 +雨水沟 +小丑 +建造 +鹅 +地毯 +赛车俱乐部 +超级飞侠 +美女与野兽 +克兰娜德 +中央处理器 +儿童故事 +口罩 +警匪片 +美女直播 +海洋 +睡衣 
+忍者 +烧伤 +裙子 +剪影 +生活大爆炸 +麦田怪圈 +勺子 +狮子王 +床戏 +导管 +冰雪奇缘 +彩泥 +货物 +驼铃 +牙膏 +高铁 +古风 +新娘 +深空传说 +鹰 +鹿 +铲车 +星际战甲 +怪物猎人 +转蛋 +香奈儿 +醉驾 +坦克世界 +新能源汽车 +幻想传奇 +纺织品 +超级英雄 +谍战片 +起重机 +钥匙·按键 +苹果商店 +河粉 +名侦探柯南 +蜂窝 +演唱会 +喷泉 +比基尼 +面粉 +日本食玩 +王子 +画画 +激情戏 +中国队 +帆船 +电商 +消防员 +美腿 +侏罗纪 +吃饭 +锯木机 +烤面包机 +土星 +珠子 +大头儿子 +穴位 +旅客 +演员 +短信 +擂台 +东方永夜抄 +龙之谷 +马路 +袜子 +神秘岛 +勋章 +斑马 +攻壳特工队 +激流回旋 +路易鬼屋 +飞盘 +汽车 +走秀 +异度之刃 +奥利奥 +相声 +房屋 +三国无双 +猫和老鼠 +高校 +鬼片 +维修 +巢 +煎蛋 +哪吒 +排球 +人体穿孔 +核武器 +明星 +水底 +水库 +海军陆战队 +景区 +陀螺战士 +战斗公主西娜 +教学 +火花塞 +收费站 +风力 +马里奥派对 +操作系统 +灼眼的夏娜 +古罗马 +哈士奇 +气象 +神魔之塔 +锁定:现代空战 +球接头娃娃 +神鬼寓言 +幽灵战车 +战争前线 +骡子 +出游 +早餐 +华为 +房间 +现代片 +海报 +游戏王 +咳嗽 +金丝雀 +音乐剧 +根 +灯泡 +星界边境 +视频教学 +剥 +钢铁 +星之卡比 +试驾 +车技 +剑 +树 +茄子 +轨道 +坠毁 +面团 +玩具屋 +拳击 +音乐中心 +行李 +长江 +花絮 +纯情罗曼史 +地精 +铁铲 +公园 +杠铃 +旅游团 +特斯拉线圈 +喷染术 +电子书 +猪猪侠 +骆驼 +假人挑战 +推杆 +图书馆 +洗澡 +耀西之岛 +武装突袭 +幼儿园 +印刷电路板 +头盔式相机 +金字塔 +双簧管 +养老院 +黎明杀机 +复活节兔子 +马棚 +枪杀 +二维码 +击杀 +刷子 +古筝 +财经 +武术 +影视周边 +游览车 +鳄鱼 +开箱 +水晶 +街头霸王 +恐怖袭击 +过生日 +陶瓷 +健身球 +慢镜头 +贝斯 +异形附身 +风扇 +时装秀 +海底 +奔驰小哥 +弹弓 +生化奇兵 +俱乐部 +人字拖 +推土机 +钞票 +救人 +派对 +土豆 +宿舍 +玉米 +乐动魔方 +国产剧 +柚子 +模子 +细菌 +背包 +婚礼 +菠菜 +遛狗 +东方红魔乡 +山口 +驴友 +偶像大师 +噬神者 +假面骑士 +瑞奇与叮当 +新郎 +坦克在线 +网吧 +酵母 +车手 +枪击 +杂志封面 +孩之宝 +猎人 +夜市 +黑岩射手 +王座 +雕塑粘土 +同人志 +浪客剑心 +车票 +重生娃娃 +驱逐舰 +反叛的鲁路修 +领带 +死亡空间 +幽默 +障碍技巧 +运输机 +铙钹 +条码 +采石场 +排骨 +壁橱 +高尔夫球 +恐怖主义 +圆号 +悠悠球 +科技奇趣 +陶轮 +石头 +枪战 +纸板 +斯诺克 +荒野大镖客 +吉祥物 +满月 +野蛮人柯南 +家电 +电子竞技 +但丁地狱 +天花板 +披萨 +车辆 +巨人 +风车 +高速公路 +婚房 +蛤蜊 +抢救 +兔子 +航展 +火山 +发动机 +装载机 +皮艇 +梳子 +维秘 +星际火狐 +嫦娥 +沼泽 +舞曲 +炒鸡蛋 +心灵杀手 +怪物 +中国风 +理发师 +悬崖 +铅笔 +博士 +海豚 +芥末 +磨刀 +卸妆 +黄牌 +魔法门 +飞行 +游泳 +羚羊 +自动售货机 +优惠券 +银行 +打车 +东北二人转 +演讲 +香槟酒 +油罐车 +海豹 +万智牌 +步枪 +造型师 +空间站 +大风 +鼻子 +外卖 +X战警 +田径 +外星人 +木材 +速度生活 +豪车 +鬼魂 +手榴弹 +海底隧道 +表演者 +木琴 +月饼 +活页乐谱 +红牛 +天才 +南瓜饼 +鸟 +离合器 +精灵复兴 +击倒 +农产品 +轰炸 +商家 +美貌 +狗粮 +绞盘 +虚构人物 +冰川 +怒之铁拳 +车祸 +星火 +陆战队 +太阳 +大学 +录音机 +全职猎人 +内衣 +赛车总动员 +同学会 +四重奏 +桨 +驾驶员 +健身房 +瓷器 +抢劫 +爆米花 +绿色 +蕾丝 +黑熊 +公主抱 +刀剑神域 +馒头 +圣诞礼物 +墙壁 +幼儿 +信用卡 +刀 +狂飙旧金山 +日历 +新生 +婚戒 +雪 +雨 +竹子 +美人鱼 +音乐键盘 +娃娃 +键盘 +动力火车 +骑兵·装甲兵 +立交桥 +散步 +成就 +荣誉勋章 +助攻 +沙滩 +蚯蚓 +动物 +汽车越野赛 +项链 +啤酒 +女装 +和尚 +乳清蛋白 +圣诞树 +手绘 +投篮 +大麦 +光头强 +工作会议 +苍蝇 +宝藏 +射击游戏 +粉笔 +杏仁 +碗 +神舟 +胭脂 +惊天动地 
+马 +封面 +小学 +物联网 +沙子 +录音棚 +挖土机 +穿衣 +飞机 +大盘 +内涝 +恶魔 +鳄梨 +飞驰竞速 +西兰花 +实验 +录影机 +气球塔防 +跑酷 +交警 +熊 +桔梗 +解放军 +活动房屋 +相机 +数学 +特斯拉 +太空堡垒 +宅男女神 +安卓 +冰块 +鸡舍 +美妙天堂 +化石 +超时空要塞 +数字 +网球 +神秘海域 +艺考 +艺术节 +编织 +打字 +明星派 +二十一点 +护栏 +大海 +极光 +舞力全开 +广场 +神庙逃亡 +纽扣 +时装周 +西葫芦 +炊具和烤盘 +星巴克 +油炸 +划船 +创世纪 +摩托车越野赛 +星星 +金刚 +弹球 +美女 +三明治 +工艺 +冒险 +垃圾桶 +极限竞速 +加菲猫 +宝宝辅食 +首饰 +场地赛 +球 +幻想水浒 +生活剧 +希曼 +插图 +潜水 +秃鹫 +诺亚方舟 +少女 +比武 +糖果粉碎传奇 +拳皇 +墨水 +校园暴力 +引擎 +脱口秀 +路由·伐木 +牡蛎 +漂移 +熊出没 +校车 +牧羊人 +功夫 +植物大战僵尸 +朗诵 +娇妻 +镜框·画框 +百叶窗 +客流 +咖啡 +塑像 +生物学 +手电筒 +机器 +座位 +沙包·沙袋 +森林 +乐高主题公园 +视频制作 +充电器 +犬夜叉 +超级粉碎兄弟 +交通安全 +躲猫猫 +翼 +粘土动画 +山羊 +海王星 +导弹 +街头表演 +水獭 +访谈节目 +石榴 +讲解教学 +拥堵 +变形 +电饭煲 +星际公民 +猿 +头 +丝路传说 +极品飞车 +皮卡丘 +拍照 +化油器 +肥料 +鲨鱼 +星云 +冬奥会 +模拟器 +CD机 +中国梦 +捕食 +泰坦陨落 +白宫 +饺子 +光环 +火鸡 +男装 +火爆狂飙 +推钱机 +命令与征服 +大金刚国度 +古琴 +食堂 +消防站 +愤怒的小鸟 +护士 +母亲 +暗杀 +美妙旋律 +芦笋 +荷花 +弓猎 +超车 +松下 +宙斯 +生活记录 +公路 +模拟合成器 +时尚 +宾馆 +难民 +立体声扬声器 +旋转 +杯子 +模型 +坦克 +生食 +波西杰克逊 +气球 +峡谷 +锁 +粉蜡笔画 +铅笔盒 +收藏 +激光笔 +智能家居 +翻筋斗 +烤面包 +生化危机 +演奏 +百货公司 +屁股 +锯 +车站 +瓜 +极速前进 +篮子 +蹦极 +纸片马里奥 +秦时明月 +全面战争 +游乐园 +最终幻想 +水手 +水上乐园 +尾巴 +鸡蛋 +相声演员 +坚果 +硬盘驱动器 +吃货 +望远镜 +夹克 +僧侣 +山洪 +打斗 +仓库 +独奏 +毁灭战士 +牵手 +普乐路路轨道 +天鹅 +旅行社 +柔道 +景观 +古墓丽影 +蓝龙 +甜美 +拍手 +酒店 +膝盖 +歌曲 +滑翔伞 +小马宝莉 +修道院 +滑板公园 +旅馆 +云朵 +麦片 +灾区 +水槽 +卧室 +避暑 +小熊维尼 +棒球帽 +拖车 +四大名助 +铜管乐器 +沙画 +外太空 +模拟人生 +健身教练 +数字电子 +公寓 +乐迪 +枪战片 +便秘 +姑娘 +大宅门 +猪蹄 +山峰 +三国志大战 +灯 +锅炉 +火 +气球造型 +面部 +光标 +动作片 +上网本 +汽艇 +棉花 +雪橇 +热泵 +装修 +记者 +女警 +恐怖 +龙 +夜景 +民警 +算命 +手里剑 +夜晚 +笑傲江湖 +精灵 +炮弹 +表情包 +刮刮卡 +三轮车 +护目镜 +墙纸 +洗头 +红包 +星系 +运动鞋 +菌类 +冰 +拔牙 +腿 +肿瘤 +先锋 +开心农场 +迪士尼 +山体滑坡 +表格 +文物 +眉毛 +刷牙 +绝命毒师 +电子宠物 +咖啡机 +流苏花边 +素描 +超级跑跑 +搏击 +司机 +卡通 +灰姑娘 +晨练 +记号笔 +心脏 +大提琴 +卫生巾 +受灾 +任天堂 +珠宝 +英雄连 +溜冰场 +青岛大姨 +大灰熊 +骑车 +基督 +道具 +料理 +甜菜根 +鱼饵 +车床 +反曲弓 +影视 +网络直播 +车库 +波斯王子 +船厂 +捕食者 +青铜 +橄榄 +污点·着色 +咖啡屋 +水稻 +改装车 +小正太 +烧烤 +卡布奇诺 +蝴蝶结 +桥梁 +邮件 +数码宝贝 +手臂 +炉子 +学校 +霸王龙 +山 +客车 +焊接 +小车 +分裂细胞 +管道 +爱情剧 +摇滚名人堂 +游行 +完美世界 +开枪 +微波炉 +中学 +东方大头条 +香菇 +虾 +双眼皮 +椅子 +格雷少年 +相亲节目 +称重秤 +香精油 +小路 +压力清洗 +木头 +水彩画 +土豆泥 +电脑 +方舟 +乐高好友 +球体 +冷空气 +大闸蟹 +帽子 +涂料 +手提包 +战争 +水球 +汤 +西红柿 +唇妆 +商铺 +王者之剑 +腕表 +藤蔓 +钱包 +刀工 +平衡车 +奥斯卡金像奖 +抗日剧 +导游 +行星边际 +泡沫 +任务栏 +中药 +死侍 +小小大星球 +自行车 +签名 +胸肌 
+太极 +儿童安全座椅 +口哨 +罗技 +休闲 +汉堡 +德军司令部 +变压器 +考拉 +动物之森 +手势 +竖琴 +椰子 +大炮 +医保 +杂技 +电影摄像机 +表演艺术 +话剧 +工作室 +黄河 +吸毒 +黄油 +无限试驾 +高空 +冬天 +酒 +洞穴 +甘薯 +流星体 +手表 +救护车 +金牌 +麦迪逊广场花园 +特技演员 +饼干 +垃圾车 +服装搭配 +出租车 +暴力 +女王 +盗墓 +手提箱 +丝巾 +化学反应 +海贼王 +淋浴 +选秀 +成型 +童话故事 +麦克风 +黑客 +无尽传说 +羊 +狙击手 +小轮车 +夺宝奇兵 +美食 +食品 +肥皂泡 +骑牛 +辫子 +重型设备 +战队 +制服诱惑 +法官 +蝎子 +小屋 +酒精灯 +青鬼 +马赛克 +南方公园 +无人机 +调酒师 +万万没想到 +粉底 +捕鱼 +初音未来 +毒贩 +矮人 +好莱坞 +六孔哨 +棺材 +猜拳 +潜水服 +搞笑 +火星 +盗窃 +DJ +沐浴类产品 +长颈鹿 +整蛊 +围攻 +教堂 +黑带 +浮桥 +单眼皮 +陷 +软件 +过山车大亨 +围巾 +幸存者 +情感剧 +洗剂 +拆除 +星际迷航 +浮子 +雪地 +安保 +黄金眼 +追尾 +岩石 +电视广告 +行窃 +会计 +鸭子 +VR显示器 +莱克斯卢瑟 +反恐精英 +蒸汽机 +球场 +游戏动漫 +玉米卷 +漫威传奇 +腾讯 +亚洲 +卫生间 +吸烟 +战争机器 +青蛙 +喜羊羊与灰太狼 +飞艇 +猎犬 +招式 +拉伸 +连帽衫 +欧美音乐 +恶魔岛 +拳击之夜 +车 +大型强子对撞机 +舰艇 +枫之谷 +真功夫 +轴 +飞碟 +生物 +魔兽争霸 +欧巴 +平底锅 +石膏 +钢琴 +海关 +剪纸 +坐垫 +镜子 +夏令营 +战争之人 +简历 +彩排 +船 +真空管 +邮轮 +法制节目 +皇室战争 +小龙斯派罗 +博览会 +舞蹈革命 +生活 +圣诞贺卡 +拥抱 +飞飞全明星 +驾考 +卫生纸 +上市 +果酱 +儿子 +教会 +艺术团 +刷卡 +信封 +军阀 +军队 +黑塔利亚 +玉米饼 +滑雪 +猕猴桃 +提拉米苏 +航天 +芭蕾 +狮子 +跑步机 +杀出重围 +忍者龙剑传 +碰撞 +使命召唤 +自拍 +火柴 +火车站 +枫树 +咖啡师 +解说 +狒狒 +终极格斗冠军 +魔法禁书目录 +消防车 +极限运动 +电脑机箱 +兵 +家畜 +墨镜 +演技派 +大长腿 +功夫片 +梯子 +夏日 +排箫 +法师 +急救 +福尔摩斯 +农场 +发型 +决战之夜 +太子妃 +华夫饼 +刺猬索尼克 +赌博 +磨砂机 +办公室 +器官 +毕业 +军训 +带子 +治愈 +船长 +砂浆 +最游记 +绿野仙踪 +炉石传说 +数字录像机 +清洁 +喷气艇 +刺猬 +恒温器 +透视装 +黑执事 +基金 +守望者 +ATM取款机 +干墙 +曲棍球 +双节棍 +明胶 +锤子 +婚宴 +街道 +甜饼怪 +上帝模式 +狂神国度 +烈火战车 +麻将 +X音素 +液压机 +水杯 +扭曲 +魔界战记 +车评 +独角兽 +特种兵 +诱饵 +活动 +面具 +九阴真经 +实况足球 +护肤品 +游戏工作室 +榴莲 +马戏团 +原油 +蚁类 +分娩 +钓鱼 +游戏手柄 +影评 +虚幻竞技场 +神枪手 +架线工 +无线遥控飞机 +轮滑 +排气系统 +水管 +电源 +星之海洋 +摄像机 +纪录片 +优雅 +闺蜜 +曼妥思 +作曲家 +锡罐 +骑行 +快递 +电影节 +车队 +犀牛 +肌肉 +纽约时代广场 +敌人 +英雄 +八路 +纹身 +留声机唱片 +家常菜 +影视原声 +撞车 +达人秀 +古玩 +吊坠手链 +旅游 +录节目 +竞技 +黄梅戏 +村民 +昆虫 +旅行车 +草原 +毛衣 +叉车 +决斗大师 +灌木 +手工 +神之浩劫 +广场舞 +工厂 +练习室 +智能硬件 +龙珠 +龙梦幻境 +模仿 +枪支 +加速处理单元 +皮卡 +踏板车 +卡丁车 +歹徒 +跳跃 +大屠杀 +阀 +霍比特人 +煤矿 +遥控车 +女仆 +眼镜 +遇难者 +足球 +英雄工厂 +种族 +武打 +皇牌空战 +曲奇饼 +蜡像 +衬衫 +平衡木 +火灾 +水果蜜饯 +孔雀 +头文字D +战国 +正手击打 +港台剧 +空中巴士 +部队 +挡风玻璃刮水器 +楼梯 +无人驾驶 +写作 +塑料袋 +灯塔 +徒步旅行 +埃菲尔铁塔 +快餐 +丛林 +怪兽 +灌篮高手 +导航 +台球 +裤子 +包子 +绘图仪 +宠物 +冲浪板 +厕所 +龙虾 +寿司 +海蜇 +赛车游戏 +下午茶 +跨栏 +图像扫描仪 +王者荣耀 +钢琴弹奏 +润肤膏 +真人快打 +橡皮泥 +二胡 +新封印传说 +衣服熨斗 +红烧肉 +除毛 +变脸 +泡菜 +酸奶 +中文 +甘蔗 +拉丁 +萨克斯 +鼓 +炸弹人 +壁炉 
+球员 +角斗士 +轮缘 +病毒 +洛基 +科技数码 +梦想俱乐部 +私房菜 +平板 +灯光 +圆筒 +工人 +音乐 +灯具 +探险 +相亲 +传送门 +互联网 +喝 +鼠 +齿轮 +油脂 +旗 +糖霜酥皮 +光学错觉 +数字音频工作站 +击球 +截拳道 +指环王 +高达 +网球王子 +瘦腿 +神秘博士 +自行火炮 +向日葵 +纤维 +电视台 +羊肉 +飞行员 +电车 +按摩 +射箭 +欧洲杯 +戒指 +英雄传说 +棋牌 +魔术 +电动车 +体操 +毁灭公爵 +T恤 +宗教 +豚鼠 +精彩剪辑 +卡拉OK +护肤 +海盗 +染发 +名人采访 +锐化 +午夜俱乐部 +吃鱼 +飙车 +吸管 +肾脏 +焙烧 +跑步 +紫罗兰 +海岛奇兵 +东京喵喵 +阅兵 +偷窃 +奶茶 +辣条 +特战先锋 +蝙蝠侠 +孤岛危机 +魔法王国 +挖掘机 +U盘 +荧光棒 +图章 +女婴 +光晕 +礼品 +会议 +车展 +电音 +家具 +木雕 +台锯 +终极奇迹 +草坪 +模拟城市 +画眉 +淑女 +酒馆 +唇膏 +手机数码 +橄榄球 +锻造 +水疗 +音悦台 +反导系统 +动感 +第二人生 +星空 +园艺 +稻草人 +无头骑士 +盔甲 +舞会 +蛋 +高空抛物 +无敌浩克 +姜饼 +印刷 +帝国时代 +黄山 +鲁邦三世 +盲人 +蛇 +睡眠 +战舰世界 +蟑螂 +面包车 +缝纫针 +脂肪 +纸模型 +室内装潢 +恐怖分子 +客机 +欧美影视 +便利店 +核弹 +双面人 +厨师 +跑道 +计算机 +灾难片 +飞哥与小佛 +放牧 +文艺演出 +肖像 +红绿灯 +锥体 +喇叭 +赛道狂飙 +全家福 +麻辣烫 +包包 +身体护甲 +航空 +毒品 +天空 +针织 +魔杖 +猪肉 +砖 +松糕 +圣诞装饰 +轰炸机 +无尽的任务 +摇滚史密斯 +网页 +汽车照明系统 +小镇 +巫师 +月球 +硬汉 +机车 +面食 +手术 +海鲜 +玩具熊的五夜后宫 +巧克力 +手机 +Vox +画法 +莫妮卡的团伙 +大米 +全金属狂潮 +随声听 +旋律 +放生 +操场 +窗户 +恐怖喜剧 +大力水手 +惩罚者 +木工 +悬疑 +长方形 +木片 +电子电路 +查理与巧克力工厂 +不锈钢 +苍翼默示录 +盒子 +耐力赛 +保龄球 +海啸 +舰队收藏 +死亡岛 +歌手 +电话 +感染:幸存者故事 +真人秀 +恶魔城 +五佳球 +机械 +马里奥与路易吉 +饲养员 +滑水 +龙舟 +大理石 +港片 +葫芦娃 +武装分子 +奶油烤菜 +吓人 +斧头 +正义联盟 +超凡双生 +蜜蜂 +游艇 +头骨 +道路 +神奇四侠 +弓道 +呼啦圈 +拍客 +航空母舰 +狂热节拍 +宇宙 +美景 +健身队 +武侠 +武林高手 +测评 +薄樱鬼 +人物专访 +颈椎 +皮带 +少年泰坦 +黑色 +交响乐 +震荡 +火炉 +光盘 +喝水 +守望先锋 +烹饪 +装甲车 +棒球 +网游 +黄蜂 +安全带 +泰坦 +巴掌 +指南 +复活节彩蛋 +餐馆 +樱花 +溜冰鞋 +机甲战士 +耐克 +命运石之门 +装扮 +山水画 +耀斑 +贺卡 +日本团子 +月亮 +黑人 +科普 +钥匙扣 +甜瓜 +垃圾 +美食猎人 +头巾 +无线电遥控船 +骨牌 +单挑 +上古世纪 +覆盆子 +绳子 +海绵 +超模 +香肠 +奇观 +直线加速赛 +菜园 +雨伞 +十二生肖 +奶油 +汽车修理 +大号 +倒霉熊 +音乐节目 +唇彩 +几何冲刺 +视频游戏厅 +射击 +鬼屋 +手套 +驾驶 +青蛙军曹 +鞍 +港口 +彩灯 +广播公司 +摄影 +鞋 +我的世界 +大发 +马甲线 +模式·图案 +干衣机 +机器人战斗 +人工呼吸 +华尔兹 +水族馆 +国庆 +领奖 +巫师之怒 +火影忍者 +马克杯 +战鹰 +年会 +垂钓 +摩天大楼 +炸酱面 +企鹅 +整形 +睫毛 +暴走大事件 +教程 +钢铁侠 +日出 +国家公园 +戏剧 +折纸 +花 +说唱史诗战 +白娘子 +头盔 +威浮球 +热血无赖 +眼球 +香烟 +抗战片 +小鲜肉 +音响 +武功 +场地自行车 +稻田 +真侍魂 +海战英豪 +火焰之纹章 +婚纱摄影 +发布会 +损伤 +下水道 +雕刻 +制服 +延时摄影 +凯蒂猫 +截屏 +奇幻森林 +舞台剧 +雪糕 +飞车手罗德 +我想当爷们 +肉丸 +短号 +炮兵 +孩子 +搞怪 +军事 +对决 +战神 +菜花 +欧冠 +冰壶 +蓝莓 +帐篷 +幸运星 +化妆 +激战 +方便面 +旋转木马 +人物 +磁带 +恐怖片 +梦幻龙族 +牙齿 +海滩 +猛鬼街 +鲸 +唱片公司 +露营 +松饼 +安妮 +百乐门 +圣诞 +扬琴 +棚子 +调解 +发射 +体育 +通心粉 +热可可 +二次元 +迷人 +宇航员 +运钞车 +行车记录仪 +官员 +奥数 
+玉米地 +音乐人 +彗星 +颁奖典礼 +表演 +粉丝 +军人 +堂吉诃德 +狙击枪 +减脂 +古装 +游戏机 +饥饿游戏 +撒旦 +邮票 +理发店 +网络主播 +身材火辣 +棒球 +兔八哥 +大巴车 +耳环 +数码产品 +游民星空 +泰拳 +配音秀 +机器人 +盛装舞步 +玩具人 +袋鼠 +酒吧 +蘑菇 +死亡边境 +世界杯 +驾驶舱 +海藻 +乐高 +艺术 +龙之信条 +开关 +武警 +日蚀·月蚀 +手机评测 +诛仙 +行李箱 +恐龙世界 +天宫 +滑板 +青贮饲料 +摄像头 +工程车 +阀门·龙头 +石工 +孤岛惊魂 +胫骨 +砸车 +迷你人形 +超级玛丽 +生活技巧 +武打片 +胡子 +苹果 +橙色 +灾害 +猫 +翅膀 +吵架 +唱诗班 +雷神 +扑克 +史酷比 +魔龙骑士 +人体 +拾音器 +圆圈·循环 +地狱 +运球 +游轮 +疯狂动物城 +战舰 +核反应堆 +雾霾 +版画 +真正的家庭主妇 +海龟 +烘培 +电容器 +核试验 +寒潮 +垂死之光 +橡木 +游乐场 +养生 +杀手 +魔法 +台阶·门廊 +倒塌 +法院 +硬币 +拳击比赛 +弩 +可爱 +笔记本 +花卉设计 +僵尸末日 +闹钟 +调制解调器 +狗窝 +萌妹 +部落战争 +聚会 +乐器 +劫匪 +腹语 +电动工具 +头发 +地下城与勇士 +卡牌 +卡片 +别墅 +地球冒险 +暴风雪 +瑜伽 +海狸 +安检 +绘画 +沙拉 +浴缸 +毛绒玩具 +海狮 +琵琶 +肯得基 +口红 +娱乐 +魔戒 +婴儿 +烫发器 +狂飙 +积水 +机动车 +奖 +椰奶 +芦荟 +刺客 +拖拉机 +蒙娜丽莎 +牛仔 +葡萄酒 +猴子 +潜水员 +盘式制动器 +比赛 +吸尘器 +豌豆 +拍摄现场 +帆布 +喜剧演员 +蜡笔小新 +香蕉 +全民健身 +牛排 +音响系统 +啦啦队 +街头采访 +视觉小说 +弹唱 +飞车 +装甲核心 +罐头 +哈利波特 +沉香 +举重 +纸 +拼图 +电视频道 +防护 +视频游戏 +家居 +平屋顶 +开车 +航拍 +特技 +杂货店 +拍卖 +薯条 +珍珠 +手指 +柔力球 +美少女战士 +游戏公司 +冰球 +天气预报 +充气船 +爆炒 +机油 +眼泪 +西区故事 +镶嵌 +仪表着陆系统 +鱼 +爆炸 +骑马 +礼服 +植物 +战地 +淘宝 +烟花 +求婚 +饮料 +蹲 +喜剧 +猎天使魔女 +潜行者 +船员 +汽油 +低音炮 +美甲 +无花果 +超级大金刚 +猩猩 +带锯 +国旗 +开幕式 +货运工具 +腹部 +泥潭 +秀逗魔导士 +交通 +小米 +钢琴家 +机票 +肉 +姜黄 +龙腾世纪 +杀戮空间 +婴儿吊带 +拿铁 +僵尸片 +孤儿院 +自爆 +马里奥赛车 +火锅 +冬季运动 +女巫 +大厦 +街头赛车 +快板 +驾校 +秀场 +侠盗猎车手 +杂志拍摄 +乌龟 +蜂蜜 +减肥操 +水上艇筏 +象 +播种 +单词 +偷车 +玻璃贴膜 +俄罗斯方块 +惊悚 +火车头托马斯 +净水器 +电影解说 +画家 +谷类 +机枪 +滑翔翼 +瓶子 +合唱 +超胆侠 +轮盘 +电气布线 +考古 +豆类 +集装箱 +异形 +洗碗机 +割草机 +茶 +计算器 +魔方 +宝莱坞 +辣妹 +军官 +牛人 +后备箱 +海边 +电磁线圈 +印度 +红酒 +食谱 +工地 +特技飞行 +家庭剧 +培乐多 +温泉 +钩针 +宫殿 +时装 +鹦鹉 +棕熊 +运动会 +空姐 +球星卡 +葱油饼 +洛奇 +女团 +老虎机 +记者会 +体育场 +票房 +无冬城 +浣熊 +洗衣服 +菜市场 +寂静岭 +肉汁 +大力士 +鼓棒 +金属加工 +壶铃 +德云社 +国际军事 +驾照 +面条 +手枪 +金条 +泰迪熊 +河马 +洗涤 +阁楼 +爆炸袭击 +桑拿 +踢打 +爱探险的朵拉 +葡萄园 +闪光 +妈妈 +骨头 +钓竿 +颜色 +摩托车头盔 +纱线 +驯鹿 +银魂 +独轮车 +虚拟玩家角色 +圣经 +毛笔字 +电影 +音乐影片 +西餐 +菠萝 +西湖 +清洁剂 +斗牛 +小红帽 +餐巾 +单杠 +地球 +爽肤水 +打印机 +吹风机 +记号笔 +小麦 +螺帽 +乐高都市 +白酒 +显卡 +都市 +画展 +光之美少女 +银行卡 +群星 +穿越火线 +古装剧 +单簧管 +网络 +洪水 +美容 +汤姆猫 +讲故事 +海底世界 +操作杆 +赛车方向盘 +倚天 +球赛 +海岸 +空调 +铁路 +怪物卡车大毁灭 +下巴 +票 +复仇者联盟 +新闻 +雪崩 +彩绘 +狂野飙车 +沙雕 +木偶 +轮椅 +文艺 +家电公司 +海岛 +苹果派 +降龙十八掌 +打结 +素食 +深渊传说 +骑士 +视频解说 +活塞 +小猪佩奇 +直播 +蟋蟀 +乘客 +英雄联盟 +大气污染 +硬石餐厅 +晶体管 +宝石 +奶酪 
+图表 +鲜花 +背心 +反恐 +科学家 +种子 +喂食 +爪子 +火线精英 +体育用品 +照片 +军事武器 +直线 +电脑硬件 +开锁 +鼓手 +模型车 +航天器 +屏幕 +花生 +直排轮滑鞋 +军舰 +钻石 +橄榄油 +稻草 +蜡笔 +妆容 +杀手本能 +餐厅 +摔跤 +内裤 +蹦床 +樱兰高校男公关部 +跆拳道 +科幻 +豪宅 +停车 +冰淇淋 +钢盘·平底深锅 +大乱斗 +服装店 +千与千寻 +音标 +吉他英雄 +南瓜 +采访 +小吃 +漫画英雄 +最后生还者 +红薯 +镜之边缘 +燃脂 +葫芦丝 +篮球 +组装 +台球杆 +过滤器 +空翻 +壁画 +闪电 +海域 +红唇 +面试 +吊坠 +武侠剧 +睫毛膏 +香水 +舞蹈室 +资讯 +眼影 +军装 +躺骑车 +白色 +英魂之刃 +魔鬼 +饭团 +琴弦 +冰箱 +通灵王 +公交 +魔法之战 +泳装 +文本 +长号 +羊毛 +古诗 +马克思佩恩 +演习 +陀螺仪 +车牌 +静物写生 +木屋 +米饭 +萝卜 +高尔夫球 +散热器 +直播间 +星球大战 +黄金 +果汁 +疯狂橄榄球 +散打 +犰狳 +爱情故事 +决斗 +电动汽车 +缝纫 +餐饮 +魔兽世界 +设计师 +航班 +麻薯 +以撒的结合 +中提琴 +孢子 +说唱 +死神 +迷宫 +战斗 +警长 +手球 +睡袋 +镲片 +城堡 +性感 +酒精 +生化模式 +湖 +黑暗 +小小世界 +户外休闲 +球技 +同步带 +制动 +剧情片 +球鞋 +清纯 +聚餐 +刺绣 +减肥 +对唱 +睡美人 +儿童 +烤箱 +黄色 +干草 +神灵 +航空公司 +元素周期表 +电影院 +女神转生 +字典 +飞镖 +战锤 +失忆症 +死亡笔记 +亚马逊公司 +虐杀原形 +象棋 +虚幻引擎 +烧烤架 +奶粉 +悉尼歌剧院 +伐木 +草莓 +爆破 +忍者神龟 +银 +四轮车 +鬼泣 +娱乐八卦 +浴室 +鸡肉 +胡萝卜 +胎儿 +液体 +收割机 +铜 +玩具世界 +一字马 +飞船 +修剪器 +煤炭 +简笔图 +网剧 +小品 +洋葱 +便当 +百事 +蜘蛛 +警车 +马车 +尼姑 +河流 +斗牛士 +染色 +黄瓜 +跳水 +音乐大师课 +蜗牛 +钢笔 +故宫 +公益片 +渔船 +蓝色 +卷发器 +超级快递 +鞭炮 +珊瑚 +实战 +跳绳 +滑冰 +小行星 +翻车 +博物馆 +欧元 +哆啦A梦 +乐乐天使娃娃 +空难 +阴阳师 +辣椒 +青之驱魔师 +鸿雁 +SaGa +凝胶 +池塘 +节拍器 +亲子节目 +播放机 +打印 +歌迷 +荒野星球 +农业 +地震 +时政 +吴哥窟 +拉面 +音乐节 +甜甜圈 +藤球 +灾难意外 +骑马与砍杀 +柑橘 +不明飞行物 +软管 +相册 +触摸屏 +飞行表演 +圣杯神器 +紫色 +笛子 +存储卡 +鸽赛 +蔬菜 +山地自行车 +哑剧大师 +双簧 +长椅 +松弛熊 +官兵 +巧克力 +动画 +侦探 +溜冰 +拉链 +警察局 +工程师 +分屏 +牧师 +球拍 +馅饼 +马展 +蜡烛 +游戏 +舌头 +增压器 +泰拉瑞亚 +三国 +污染 +管带夹 +丫鬟 +歌剧魅影 +温室 +八卦 +晚会 +多米诺骨牌 +西瓜 +无主之地 +薯片 +降落伞 +家具装饰 +螃蟹 +模拟山羊 +麦当劳 +传感器 +粉扑 +太阳能 +裁判 +保卫萝卜 +地铁 +松鼠 +猫女 +课堂 +木星 +耳机 +耳朵 +医学 +尼尔机械纪元 +驾驶证 +婚车 +砂锅 +死海 +海绵宝宝 +模拟农场 +警官 +调酒 +龙战士 +动车 +老鼠 +辛普森一家 +蜥蜴 +和服 +女生 +影视混剪 +长毛绒 +广告牌 +撒娇 +炒锅 +萌宝 +自然 +指甲油 +灰泥 +火腿 +桌子 +月姬格斗 +塑料 +大脑 +接线盒 +攀岩 +水果忍者 +货币 +秋千 +销售 +卷轴 +化妆品 +包裹 +斑马线 +面包超人 +蛋糕 +肉桂 +寺庙 +书法 +团队套牛 +仙人掌 +餐饮 +火箭炮 +视频直播 +鬼娃回魂 +画线骑士 +宜家 +春晚 +步行 +日落 +袋子 +击剑 +理发 +地下室 +斗地主 +打针 +喝酒 +喷漆 +柯南时代 +锦鲤 +凝乳 +杀戮地带 +恶霸鲁尼 +奖牌 +猫头鹰 +赛道 +战士 +美照 +购物 +蝴蝶 +字母表 +客厅 +乌鸦 +唢呐 +反串 +潘多拉 +监控 +烤鸭 +明星大乱斗 +葡萄 +飓风 +病人 +吊车 +蝙蝠 +伪装 +益智玩具 +舞蹈 +合金装备 +跳楼 +勇者斗恶龙 +油 +网站 +厨师机 +凯恩的遗产 +钱 +食材 +外交部 +酒厂 +显示器 +主持 +羽绒服 +牛仔布 +车模 +盐 +芝麻 +痘痘 +股票 +微笑 +菜单 +地板 +烤鸡 +自动唱机 +雪貂 +涡轮 +扎染 +歌剧 +变形金刚 +失火 +门票 +雪山 +风筝 
+长袍·礼服 +书柜 +家庭教师 +死亡之屋 +DarkOrbit +粮食 +公益活动 +藏獒 +渔民 +下一站巨星 +彩虹手环 +苦瓜 +冲浪 +卷心菜 +珠饰 +西贡小姐 +地铁酷跑 +训练营 +运输 +磁铁 +健康 +床垫 +摇摆 +街头恶搞 +糕点 +拳王 +肋骨 +猫 +曲艺 +加油站 +凉宫春日 +妖怪手表 +动力伞 +墓地 +工程 +民房 +胶片 +色带 +主教 +樱桃小丸子 +鸡翅 +轮子 +牛 +邻里 +萌 +音乐制作 +洛克人 +芒果 +地图 +劈木机 +勇士 +火锅店 +电梯 +吻 +弹球盘 +三角形 +粘土 +鸡尾酒 +慈善 +天天酷跑 +唱片骑师 +结婚 +家庭 +手机壳 +航线 +职业摔跤 +肥皂 +竞技场 +丧钟 +摩天轮 +天使 +台面 +外汇市场 +肉搏 +求生之路 +铜牌 +泡面 +流亡黯道 +灯笼 +谜题 +婴儿室 +捕猎 +尿布袋 +鱼鹰 +雪犁 +方块世界 +斑鸠 +建筑 +电视剧 +堆肥 +细胞 +邪恶力量 +零食 +湾岸竞速 +太鼓达人 +赛车 +金枪鱼 +司令 +皮肤 +马拉松 +末日 +垒球 +涂鸦 +充气城堡 +十字架 +食疗 +早教 +速叠杯 +纸牌 +披肩 +躲避球 +柠檬 +打牌 +抗战 +绕口令 +美容院 +惠普 +情感节目 +永恒之塔 +电脑鼠标 +虚拟现实 +特警 +吊床 +货车 +飞绑 +可乐 +运动 +双重国度 +多功能工具 +妹子 +农村 +眼睛 +干冰 +果冻 +相声小品 +电线杆 +战友 +影视配音 +孤岛生存大乱斗 +奥运 +沃尔玛 +太空 +星际之门 +装饰 +灰色 +樱桃 +电锯 +手铃 +科幻片 +身份证 +古墓 +乒乓 +溪流 +手链 +野外生存 +天线 +玻璃 +营地 +庆典 +玩具 +袭击事件 +美术 +橡皮 +加农 +镜头 +探测器 +洗发精 +彩虹岛 +武器 +装置艺术 +葱 +护理 +命运 +仓鼠 +碎石 +青蛙科密特 +螺旋桨 +七日杀 +整容 +行星 +小宝宝 +科技 +台风 +勇者前线 +皇家国教骑士团 +狂欢节 +热狗 +捉迷藏 +弦乐琴 +叶子 +床 +彼得潘 +写真 +托儿所 +设备 +冰桶挑战 +萌物 +变色龙 +花瓣 +伴郎 +打戏 +画报 +罪恶装备 +漫画 +瘫痪 +飞机失事 +奇闻趣事 +大选 +花瓶 +钢之炼金术师 +杂志 +鼠型车 +教育 +旺达与巨像 +插花 +城堡破坏者 +泵 +混音带 +字体 +超人 +倒计时 +恶作剧 +鹌鹑 +吸血鬼 +小朋友 +颤音琴 +符号 +调音台 +梦幻之星 +橘子 +奶昔 +面糊 +冬不拉 +北斗神拳 +越野 +灭火器 +水果 +婚纱 +上古卷轴 +007 +暮光之城 +蜘蛛侠 +冰沙 +下坡 +毡 +警察 +超市特工 +外套 +汉服 +女童 +筏流 +花园 +布丁 +花圈 +生菜 +新年 +清雪机 +气雾喷雾器 +暮蝉悲鸣时 +公主 +显微镜 +秋天 +模特 +收藏品 +咖喱 +空气净化器 +漫威宇宙 +混凝土 +育儿 +电子琴 +遮瑕膏 +火车 +芭比娃娃 +爵士 +音箱 +黑洞 +积木 +剑球 +奶爸 +监管 +美国队长 +爆笑 +闪电 +降世神通 +祷告 +家禽 +穿越时空 +分裂 +轮胎 +水坝 +索尼 +战斗机 +恶搞路人 +拍戏 +电池 +爆胎 +光棍 +俯卧撑 +摩斯 +饮用水 +狂热 +阅读器 +训练 +奥特曼 +王国之心 +学车 +快递员 +住宅 +袋狼大冒险 +悟空 +面包 +雷曼疯狂兔子 +杀手 +赛马 +啄木鸟伍迪 +国务院 +拖把 +壁虎 +铁拳 +高跟鞋 +动物园 +唱片 +金鹰节 +棒球公园 +宠物小精灵 +手游 +部落冲突 +兽人 +魔术师 +谷仓 +圣剑传说 +商场 +起火 +内饰 +暴龙 +鲸 +上课 +油画 +剧本 +武士 +村庄 +脖子 +卷饼 +蚊子 +狩猎 +保健品 +红毯 +总统 +塔罗牌 +偶像活动 +涂层 +合金弹头 +黑白 +沙漠 +白头鹰 +芝士 +宅男 +战利品 +军营 +围棋 +洗衣店 +教育部 +模糊 +国画 +菲比娃娃 +雕塑 +施工 +书呆子 +冬季 +F-Zero +核桃 +狱警 +游戏人物 +旗袍 +笑话 +衣柜 +综艺 +迫击炮 +梨 +圣斗士 +媒体 +辩论 +健美操 +速降 +男团 +杀人 +圣诞老人 +圆顶 +海豚音 +特技表演 +耙 +探索 +僵尸围城 +银河战士 +长城 +雪人 +作画 +狼 +星际争霸 +立方体 +武装·装备 +被子 +自行车赛 +吃东西 +金属 +交易 +铲屎官 +培根 +档案 +飞去来器 +歌舞表演 +报纸 +仙女 +舞蹈中心 +亚瑟王传奇 +浏览器 +钟 +狗 +露营车 +艺术品 +洗衣机 +睡姿 +打野 +西装 +管风琴 +半机械人 +U型场地 +光 +鸽子 +窗帘 
+练习生 +刺客信条 +黑道圣徒 +农民 +煤气灶 +播放器 +塞尔达传说 +消防 +黄铜 +胶带 +挡泥板 +越战越勇 +糖浆 +武装部队 +录像带 +倒车 +牛奶 +冰棍 +阳台 +饮品 +番茄 +灵异事件 +屋顶 +角色扮演 +大富翁 +饿狼传说 +玫瑰 +猪 +海马 +防汛抗洪 +水井 +书 +土地 +村长 +权力的游戏 +东方妖妖梦 +半条命 +国家队 +木瓜 +绿箭 +滑翔 +视频艺术 +人猿泰山 +国防部 +报警装置 +吉尼斯 +厢型布景 +突袭 +狐狸 +倒立 +搅拌机 +腹肌 +飙酷车神 +电子键盘 +惩罚 +失落的星球 +乐队 +丝绸 +冲突 +豆芽 +交通工具 +滑翔机 +亲子 +拳击手 +少儿 +厨房 +花栗鼠 +楼市 +卡通城 +夜店 +洗车 +广告 +饭店 +合气道 +雪地车 +留声机 +全民枪战 +毛皮 +迷你四驱车 +钻头 +生活常识 +少林 +校园 +拔河 +事故 +菊花 +小蛮腰 +过山车 +鸡腿 +暗黑破坏神 +炸鸡 +排版 +拼贴画 +制造业 +艺人 +选美 +猛兽 +英语 +手 +酥皮 +运动员 +卡士达酱 +内衣秀 +护照 +民航 +土匪 +监狱 +靴子 +积雪草 +沙发 +加勒比海盗 +咱们穿越吧 +极度恐慌 +拉力赛 +背部 +伴娘 +投影机 +面膜 +水 +玉·翡翠 +易拉罐 +度假村 +益智 +吻戏 +丈夫 +吊扇 +模具 +水泥 +火柴人 +公安部 +泥土 +地铁站 +打火机 +小小宠物店 +橙子 +子弹 +猴子岛 +闪电十一人 +雪碧 +指甲 +摩托车 +摄影师 +角色 +电人 +老虎 +音乐合奏 +塑料瓶 +发带 +标签·商标 +肉排 +桃子 +指板 +狼人 +分解动作 +读书 +志愿者 +灵魂能力 +星际宝贝 diff --git a/PaddleCV/video/application/video_tag/metrics/__init__.py b/PaddleCV/video/application/video_tag/metrics/__init__.py new file mode 100644 index 0000000000000000000000000000000000000000..0d1df762bdf3d3b920fc1e00d15a3a2ecdcdbe55 --- /dev/null +++ b/PaddleCV/video/application/video_tag/metrics/__init__.py @@ -0,0 +1 @@ +from .metrics_util import get_metrics diff --git a/PaddleCV/video/application/video_tag/metrics/metrics_util.py b/PaddleCV/video/application/video_tag/metrics/metrics_util.py new file mode 100644 index 0000000000000000000000000000000000000000..730562e8055547ec6afb94790a6f414b1350f7e1 --- /dev/null +++ b/PaddleCV/video/application/video_tag/metrics/metrics_util.py @@ -0,0 +1,169 @@ +# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserve. +# +#Licensed under the Apache License, Version 2.0 (the "License"); +#you may not use this file except in compliance with the License. +#You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +#Unless required by applicable law or agreed to in writing, software +#distributed under the License is distributed on an "AS IS" BASIS, +#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
+#See the License for the specific language governing permissions and +#limitations under the License. + +from __future__ import absolute_import +from __future__ import unicode_literals +from __future__ import print_function +from __future__ import division + +import os +import io +import logging + +import numpy as np +import json +from metrics.youtube8m import eval_util as youtube8m_metrics + +logger = logging.getLogger(__name__) + + +class Metrics(object): + def __init__(self, name, mode, metrics_args): + """Not implemented""" + pass + + def calculate_and_log_out(self, fetch_list, info=''): + """Not implemented""" + pass + + def accumulate(self, fetch_list, info=''): + """Not implemented""" + pass + + def finalize_and_log_out(self, info='', savedir='./'): + """Not implemented""" + pass + + def reset(self): + """Not implemented""" + pass + + +class Youtube8mMetrics(Metrics): + def __init__(self, name, mode, metrics_args): + self.name = name + self.mode = mode + self.num_classes = metrics_args['MODEL']['num_classes'] + self.topk = metrics_args['MODEL']['topk'] + self.calculator = youtube8m_metrics.EvaluationMetrics(self.num_classes, + self.topk) + if self.mode == 'infer': + self.infer_results = [] + + def calculate_and_log_out(self, fetch_list, info=''): + loss = np.mean(np.array(fetch_list[0])) + pred = np.array(fetch_list[1]) + label = np.array(fetch_list[2]) + hit_at_one = youtube8m_metrics.calculate_hit_at_one(pred, label) + perr = youtube8m_metrics.calculate_precision_at_equal_recall_rate(pred, + label) + gap = youtube8m_metrics.calculate_gap(pred, label) + logger.info(info + ' , loss = {0}, Hit@1 = {1}, PERR = {2}, GAP = {3}'.format(\ + '%.6f' % loss, '%.2f' % hit_at_one, '%.2f' % perr, '%.2f' % gap)) + + def accumulate(self, fetch_list, info=''): + if self.mode == 'infer': + predictions = np.array(fetch_list[0]) + video_id = fetch_list[1] + for i in range(len(predictions)): + topk_inds = predictions[i].argsort()[0 - self.topk:] + topk_inds = topk_inds[::-1] 
+ preds = predictions[i][topk_inds] + self.infer_results.append( + (video_id[i], topk_inds.tolist(), preds.tolist())) + else: + loss = np.array(fetch_list[0]) + pred = np.array(fetch_list[1]) + label = np.array(fetch_list[2]) + self.calculator.accumulate(loss, pred, label) + + def finalize_and_log_out(self, + info='', + savedir='./data/results', + label_file='./label_3396.txt'): + if self.mode == 'infer': + for index, item in enumerate(self.infer_results): + video_id = item[0] + logger.info( + '========video_id [ {} ] , topk({}) preds: ========\n'. + format(video_id, self.topk)) + + # read label names and close the file handle explicitly + f = io.open(label_file, "r", encoding="utf-8") + fl = f.readlines() + f.close() + res_list = [] + res_list.append(video_id) + for i in range(len(item[1])): + class_id = item[1][i] + class_prob = item[2][i] + class_name = fl[class_id].split('\n')[0] + print('class_id: {},'.format(class_id), 'class_name:', + class_name, + ', probability: {} \n'.format(class_prob)) + save_dict = { + "class_id": class_id, + "class_name": class_name, + "probability": class_prob + } + res_list.append(save_dict) + + # save infer result into output dir + with io.open( + os.path.join(savedir, 'result' + str(index) + '.json'), + 'w', + encoding='utf-8') as f: + f.write(json.dumps(res_list, ensure_ascii=False)) + else: + epoch_info_dict = self.calculator.get() + logger.info(info + '\tavg_hit_at_one: {0},\tavg_perr: {1},\tavg_loss: {2},\taps: {3},\tgap: {4}'\ + .format(epoch_info_dict['avg_hit_at_one'], epoch_info_dict['avg_perr'], \ + epoch_info_dict['avg_loss'], epoch_info_dict['aps'], epoch_info_dict['gap'])) + + def reset(self): + self.calculator.clear() + if self.mode == 'infer': + self.infer_results = [] + + +class MetricsNotFoundError(Exception): + """Raised when the requested metrics is not registered in MetricsZoo.""" + pass + + +class MetricsZoo(object): + def __init__(self): + self.metrics_zoo = {} + + def regist(self, name, metrics): + assert metrics.__base__ == Metrics, "Unknown metrics type {}".format( + type(metrics)) + self.metrics_zoo[name] = metrics + + def get(self, name, mode, cfg): + for k, v in self.metrics_zoo.items(): + if k 
== name: + return v(name, mode, cfg) + raise MetricsNotFoundError(name, self.metrics_zoo.keys()) + + +# singleton metrics_zoo +metrics_zoo = MetricsZoo() + + +def regist_metrics(name, metrics): + metrics_zoo.regist(name, metrics) + + +def get_metrics(name, mode, cfg): + return metrics_zoo.get(name, mode, cfg) + + +# sort by alphabet +regist_metrics("ATTENTIONCLUSTER", Youtube8mMetrics) +regist_metrics("ATTENTIONLSTM", Youtube8mMetrics) +regist_metrics("NEXTVLAD", Youtube8mMetrics) diff --git a/PaddleCV/video/application/video_tag/metrics/youtube8m/__init__.py b/PaddleCV/video/application/video_tag/metrics/youtube8m/__init__.py new file mode 100644 index 0000000000000000000000000000000000000000..e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 diff --git a/PaddleCV/video/application/video_tag/metrics/youtube8m/average_precision_calculator.py b/PaddleCV/video/application/video_tag/metrics/youtube8m/average_precision_calculator.py new file mode 100644 index 0000000000000000000000000000000000000000..9bad69dd0aff1906e3548fb0322203f0bc5b408d --- /dev/null +++ b/PaddleCV/video/application/video_tag/metrics/youtube8m/average_precision_calculator.py @@ -0,0 +1,275 @@ +# Copyright 2016 Google Inc. All Rights Reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS-IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +"""Calculate or keep track of the interpolated average precision. + +It provides an interface for calculating interpolated average precision for an +entire list or the top-n ranked items. 
For the definition of the +(non-)interpolated average precision: +http://trec.nist.gov/pubs/trec15/appendices/CE.MEASURES06.pdf + +Example usages: +1) Use it as a static function call to directly calculate average precision for +a short ranked list in the memory. + +``` +import random + +p = np.array([random.random() for _ in xrange(10)]) +a = np.array([random.choice([0, 1]) for _ in xrange(10)]) + +ap = average_precision_calculator.AveragePrecisionCalculator.ap(p, a) +``` + +2) Use it as an object for long ranked list that cannot be stored in memory or +the case where partial predictions can be observed at a time (Tensorflow +predictions). In this case, we first call the function accumulate many times +to process parts of the ranked list. After processing all the parts, we call +peek_interpolated_ap_at_n. +``` +p1 = np.array([random.random() for _ in xrange(5)]) +a1 = np.array([random.choice([0, 1]) for _ in xrange(5)]) +p2 = np.array([random.random() for _ in xrange(5)]) +a2 = np.array([random.choice([0, 1]) for _ in xrange(5)]) + +# interpolated average precision at 10 using 1000 break points +calculator = average_precision_calculator.AveragePrecisionCalculator(10) +calculator.accumulate(p1, a1) +calculator.accumulate(p2, a2) +ap3 = calculator.peek_ap_at_n() +``` +""" + +import heapq +import random +import numbers + +import numpy + + +class AveragePrecisionCalculator(object): + """Calculate the average precision and average precision at n.""" + + def __init__(self, top_n=None): + """Construct an AveragePrecisionCalculator to calculate average precision. + + This class is used to calculate the average precision for a single label. + + Args: + top_n: A positive Integer specifying the average precision at n, or + None to use all provided data points. + + Raises: + ValueError: An error occurred when the top_n is not a positive integer. 
+ """ + if not ((isinstance(top_n, int) and top_n >= 0) or top_n is None): + raise ValueError("top_n must be a positive integer or None.") + + self._top_n = top_n # average precision at n + self._total_positives = 0 # total number of positives have seen + self._heap = [] # max heap of (prediction, actual) + + @property + def heap_size(self): + """Gets the heap size maintained in the class.""" + return len(self._heap) + + @property + def num_accumulated_positives(self): + """Gets the number of positive samples that have been accumulated.""" + return self._total_positives + + def accumulate(self, predictions, actuals, num_positives=None): + """Accumulate the predictions and their ground truth labels. + + After the function call, we may call peek_ap_at_n to actually calculate + the average precision. + Note predictions and actuals must have the same shape. + + Args: + predictions: a list storing the prediction scores. + actuals: a list storing the ground truth labels. Any value + larger than 0 will be treated as positives, otherwise as negatives. + num_positives = If the 'predictions' and 'actuals' inputs aren't complete, + then it's possible some true positives were missed in them. In that case, + you can provide 'num_positives' in order to accurately track recall. + + Raises: + ValueError: An error occurred when the format of the input is not the + numpy 1-D array or the shape of predictions and actuals does not match. + """ + if len(predictions) != len(actuals): + raise ValueError( + "the shape of predictions and actuals does not match.") + + if not num_positives is None: + if not isinstance(num_positives, + numbers.Number) or num_positives < 0: + raise ValueError( + "'num_positives' was provided but it wan't a nonzero number." 
+ ) + + if not num_positives is None: + self._total_positives += num_positives + else: + self._total_positives += numpy.size(numpy.where(actuals > 0)) + topk = self._top_n + heap = self._heap + + for i in range(numpy.size(predictions)): + if topk is None or len(heap) < topk: + heapq.heappush(heap, (predictions[i], actuals[i])) + else: + if predictions[i] > heap[0][0]: # heap[0] is the smallest + heapq.heappop(heap) + heapq.heappush(heap, (predictions[i], actuals[i])) + + def clear(self): + """Clear the accumulated predictions.""" + self._heap = [] + self._total_positives = 0 + + def peek_ap_at_n(self): + """Peek the non-interpolated average precision at n. + + Returns: + The non-interpolated average precision at n (default 0). + If n is larger than the length of the ranked list, + the average precision will be returned. + """ + if self.heap_size <= 0: + return 0 + predlists = numpy.array(list(zip(*self._heap))) + + ap = self.ap_at_n( + predlists[0], + predlists[1], + n=self._top_n, + total_num_positives=self._total_positives) + return ap + + @staticmethod + def ap(predictions, actuals): + """Calculate the non-interpolated average precision. + + Args: + predictions: a numpy 1-D array storing the sparse prediction scores. + actuals: a numpy 1-D array storing the ground truth labels. Any value + larger than 0 will be treated as positives, otherwise as negatives. + + Returns: + The non-interpolated average precision at n. + If n is larger than the length of the ranked list, + the average precision will be returned. + + Raises: + ValueError: An error occurred when the format of the input is not the + numpy 1-D array or the shape of predictions and actuals does not match. + """ + return AveragePrecisionCalculator.ap_at_n(predictions, actuals, n=None) + + @staticmethod + def ap_at_n(predictions, actuals, n=20, total_num_positives=None): + """Calculate the non-interpolated average precision. + + Args: + predictions: a numpy 1-D array storing the sparse prediction scores. 
+ actuals: a numpy 1-D array storing the ground truth labels. Any value + larger than 0 will be treated as positives, otherwise as negatives. + n: the top n items to be considered in ap@n. + total_num_positives : (optionally) you can specify the number of total + positive + in the list. If specified, it will be used in calculation. + + Returns: + The non-interpolated average precision at n. + If n is larger than the length of the ranked list, + the average precision will be returned. + + Raises: + ValueError: An error occurred when + 1) the format of the input is not the numpy 1-D array; + 2) the shape of predictions and actuals does not match; + 3) the input n is not a positive integer. + """ + if len(predictions) != len(actuals): + raise ValueError( + "the shape of predictions and actuals does not match.") + + if n is not None: + if not isinstance(n, int) or n <= 0: + raise ValueError("n must be 'None' or a positive integer." + " It was '%s'." % n) + + ap = 0.0 + + predictions = numpy.array(predictions) + actuals = numpy.array(actuals) + + # add a shuffler to avoid overestimating the ap + predictions, actuals = AveragePrecisionCalculator._shuffle(predictions, + actuals) + sortidx = sorted( + range(len(predictions)), key=lambda k: predictions[k], reverse=True) + + if total_num_positives is None: + numpos = numpy.size(numpy.where(actuals > 0)) + else: + numpos = total_num_positives + + if numpos == 0: + return 0 + + if n is not None: + numpos = min(numpos, n) + delta_recall = 1.0 / numpos + poscount = 0.0 + + # calculate the ap + r = len(sortidx) + if n is not None: + r = min(r, n) + for i in range(r): + if actuals[sortidx[i]] > 0: + poscount += 1 + ap += poscount / (i + 1) * delta_recall + return ap + + @staticmethod + def _shuffle(predictions, actuals): + random.seed(0) + suffidx = random.sample(range(len(predictions)), len(predictions)) + predictions = predictions[suffidx] + actuals = actuals[suffidx] + return predictions, actuals + + @staticmethod + def 
_zero_one_normalize(predictions, epsilon=1e-7):
+        """Normalize the predictions to the range between 0.0 and 1.0.
+
+        For some predictions like SVM predictions, we need to normalize them before
+        calculating the interpolated average precision. The normalization will not
+        change the rank in the original list and thus won't change the average
+        precision.
+
+        Args:
+          predictions: a numpy 1-D array storing the sparse prediction scores.
+          epsilon: a small constant to avoid the denominator being zero.
+
+        Returns:
+          The normalized prediction.
+        """
+        denominator = numpy.max(predictions) - numpy.min(predictions)
+        # Use numpy.maximum (elementwise) rather than numpy.max, which would
+        # misinterpret epsilon as an axis argument.
+        ret = (predictions - numpy.min(predictions)) / numpy.maximum(denominator,
+                                                                     epsilon)
+        return ret
diff --git a/PaddleCV/video/application/video_tag/metrics/youtube8m/eval_util.py b/PaddleCV/video/application/video_tag/metrics/youtube8m/eval_util.py
new file mode 100644
index 0000000000000000000000000000000000000000..f7742236f1176073eae84fdc7c3a3a1a2e294fe0
--- /dev/null
+++ b/PaddleCV/video/application/video_tag/metrics/youtube8m/eval_util.py
@@ -0,0 +1,245 @@
+# Copyright 2016 Google Inc. All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS-IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+"""Provides functions to help with evaluating models."""
+import datetime
+import numpy
+
+from . import mean_average_precision_calculator as map_calculator
+from . import average_precision_calculator as ap_calculator
+
+
+def flatten(l):
+    """ Merges a list of lists into a single list. 
""" + return [item for sublist in l for item in sublist] + + +def calculate_hit_at_one(predictions, actuals): + """Performs a local (numpy) calculation of the hit at one. + + Args: + predictions: Matrix containing the outputs of the model. + Dimensions are 'batch' x 'num_classes'. + actuals: Matrix containing the ground truth labels. + Dimensions are 'batch' x 'num_classes'. + + Returns: + float: The average hit at one across the entire batch. + """ + top_prediction = numpy.argmax(predictions, 1) + hits = actuals[numpy.arange(actuals.shape[0]), top_prediction] + return numpy.average(hits) + + +def calculate_precision_at_equal_recall_rate(predictions, actuals): + """Performs a local (numpy) calculation of the PERR. + + Args: + predictions: Matrix containing the outputs of the model. + Dimensions are 'batch' x 'num_classes'. + actuals: Matrix containing the ground truth labels. + Dimensions are 'batch' x 'num_classes'. + + Returns: + float: The average precision at equal recall rate across the entire batch. + """ + aggregated_precision = 0.0 + num_videos = actuals.shape[0] + for row in numpy.arange(num_videos): + num_labels = int(numpy.sum(actuals[row])) + top_indices = numpy.argpartition(predictions[row], + -num_labels)[-num_labels:] + item_precision = 0.0 + for label_index in top_indices: + if predictions[row][label_index] > 0: + item_precision += actuals[row][label_index] + item_precision /= top_indices.size + aggregated_precision += item_precision + aggregated_precision /= num_videos + return aggregated_precision + + +def calculate_gap(predictions, actuals, top_k=20): + """Performs a local (numpy) calculation of the global average precision. + + Only the top_k predictions are taken for each of the videos. + + Args: + predictions: Matrix containing the outputs of the model. + Dimensions are 'batch' x 'num_classes'. + actuals: Matrix containing the ground truth labels. + Dimensions are 'batch' x 'num_classes'. + top_k: How many predictions to use per video. 
+ + Returns: + float: The global average precision. + """ + gap_calculator = ap_calculator.AveragePrecisionCalculator() + sparse_predictions, sparse_labels, num_positives = top_k_by_class( + predictions, actuals, top_k) + gap_calculator.accumulate( + flatten(sparse_predictions), flatten(sparse_labels), sum(num_positives)) + return gap_calculator.peek_ap_at_n() + + +def top_k_by_class(predictions, labels, k=20): + """Extracts the top k predictions for each video, sorted by class. + + Args: + predictions: A numpy matrix containing the outputs of the model. + Dimensions are 'batch' x 'num_classes'. + k: the top k non-zero entries to preserve in each prediction. + + Returns: + A tuple (predictions,labels, true_positives). 'predictions' and 'labels' + are lists of lists of floats. 'true_positives' is a list of scalars. The + length of the lists are equal to the number of classes. The entries in the + predictions variable are probability predictions, and + the corresponding entries in the labels variable are the ground truth for + those predictions. The entries in 'true_positives' are the number of true + positives for each class in the ground truth. + + Raises: + ValueError: An error occurred when the k is not a positive integer. 
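To make the per-class regrouping concrete, here is a standalone toy sketch of the same logic (illustrative numbers only, not calling this function):

```python
import numpy as np

# Toy batch: 2 videos x 3 classes.
preds = np.array([[0.9, 0.2, 0.5],
                  [0.1, 0.8, 0.6]])
labels = np.array([[1, 0, 0],
                   [0, 1, 1]])

k = 2
num_classes = preds.shape[1]
per_class_preds = [[] for _ in range(num_classes)]
per_class_labels = [[] for _ in range(num_classes)]
for row in range(preds.shape[0]):
    top = np.argpartition(preds[row], -k)[-k:]  # top-k class indices per video
    for c in top:
        per_class_preds[c].append(float(preds[row][c]))
        per_class_labels[c].append(int(labels[row][c]))
true_positives = [int(labels[:, c].sum()) for c in range(num_classes)]
print(true_positives)  # [1, 1, 1]
```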
+ """ + if k <= 0: + raise ValueError("k must be a positive integer.") + k = min(k, predictions.shape[1]) + num_classes = predictions.shape[1] + prediction_triplets = [] + for video_index in range(predictions.shape[0]): + prediction_triplets.extend( + top_k_triplets(predictions[video_index], labels[video_index], k)) + out_predictions = [[] for v in range(num_classes)] + out_labels = [[] for v in range(num_classes)] + for triplet in prediction_triplets: + out_predictions[triplet[0]].append(triplet[1]) + out_labels[triplet[0]].append(triplet[2]) + out_true_positives = [numpy.sum(labels[:, i]) for i in range(num_classes)] + + return out_predictions, out_labels, out_true_positives + + +def top_k_triplets(predictions, labels, k=20): + """Get the top_k for a 1-d numpy array. Returns a sparse list of tuples in + (prediction, class) format""" + m = len(predictions) + k = min(k, m) + indices = numpy.argpartition(predictions, -k)[-k:] + return [(index, predictions[index], labels[index]) for index in indices] + + +class EvaluationMetrics(object): + """A class to store the evaluation metrics.""" + + def __init__(self, num_class, top_k): + """Construct an EvaluationMetrics object to store the evaluation metrics. + + Args: + num_class: A positive integer specifying the number of classes. + top_k: A positive integer specifying how many predictions are considered per video. + + Raises: + ValueError: An error occurred when MeanAveragePrecisionCalculator cannot + not be constructed. + """ + self.sum_hit_at_one = 0.0 + self.sum_perr = 0.0 + self.sum_loss = 0.0 + self.map_calculator = map_calculator.MeanAveragePrecisionCalculator( + num_class) + self.global_ap_calculator = ap_calculator.AveragePrecisionCalculator() + self.top_k = top_k + self.num_examples = 0 + + #def accumulate(self, predictions, labels, loss): + def accumulate(self, loss, predictions, labels): + """Accumulate the metrics calculated locally for this mini-batch. 
+ + Args: + predictions: A numpy matrix containing the outputs of the model. + Dimensions are 'batch' x 'num_classes'. + labels: A numpy matrix containing the ground truth labels. + Dimensions are 'batch' x 'num_classes'. + loss: A numpy array containing the loss for each sample. + + Returns: + dictionary: A dictionary storing the metrics for the mini-batch. + + Raises: + ValueError: An error occurred when the shape of predictions and actuals + does not match. + """ + batch_size = labels.shape[0] + mean_hit_at_one = calculate_hit_at_one(predictions, labels) + mean_perr = calculate_precision_at_equal_recall_rate(predictions, + labels) + mean_loss = numpy.mean(loss) + + # Take the top 20 predictions. + sparse_predictions, sparse_labels, num_positives = top_k_by_class( + predictions, labels, self.top_k) + self.map_calculator.accumulate(sparse_predictions, sparse_labels, + num_positives) + self.global_ap_calculator.accumulate( + flatten(sparse_predictions), + flatten(sparse_labels), sum(num_positives)) + + self.num_examples += batch_size + self.sum_hit_at_one += mean_hit_at_one * batch_size + self.sum_perr += mean_perr * batch_size + self.sum_loss += mean_loss * batch_size + + return { + "hit_at_one": mean_hit_at_one, + "perr": mean_perr, + "loss": mean_loss + } + + def get(self): + """Calculate the evaluation metrics for the whole epoch. + + Raises: + ValueError: If no examples were accumulated. + + Returns: + dictionary: a dictionary storing the evaluation metrics for the epoch. The + dictionary has the fields: avg_hit_at_one, avg_perr, avg_loss, and + aps (default nan). 
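The epoch averages returned by get() are just these batch-weighted running sums divided by the total example count. A tiny standalone sketch with made-up batch statistics:

```python
# (mean_hit_at_one, batch_size) per mini-batch -- illustrative values.
batches = [(0.8, 4), (0.6, 2)]
sum_hit, num_examples = 0.0, 0
for mean_hit, batch_size in batches:
    sum_hit += mean_hit * batch_size  # accumulate(), weighted by batch size
    num_examples += batch_size
avg_hit_at_one = sum_hit / num_examples  # get()
print(round(avg_hit_at_one, 4))  # 0.7333
```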
+ """ + if self.num_examples <= 0: + raise ValueError("total_sample must be positive.") + avg_hit_at_one = self.sum_hit_at_one / self.num_examples + avg_perr = self.sum_perr / self.num_examples + avg_loss = self.sum_loss / self.num_examples + + aps = self.map_calculator.peek_map_at_n() + gap = self.global_ap_calculator.peek_ap_at_n() + + epoch_info_dict = {} + return { + "avg_hit_at_one": avg_hit_at_one, + "avg_perr": avg_perr, + "avg_loss": avg_loss, + "aps": aps, + "gap": gap + } + + def clear(self): + """Clear the evaluation metrics and reset the EvaluationMetrics object.""" + self.sum_hit_at_one = 0.0 + self.sum_perr = 0.0 + self.sum_loss = 0.0 + self.map_calculator.clear() + self.global_ap_calculator.clear() + self.num_examples = 0 diff --git a/PaddleCV/video/application/video_tag/metrics/youtube8m/mean_average_precision_calculator.py b/PaddleCV/video/application/video_tag/metrics/youtube8m/mean_average_precision_calculator.py new file mode 100644 index 0000000000000000000000000000000000000000..0ae8b0ed3717aba13b7ed35b4af025be40423967 --- /dev/null +++ b/PaddleCV/video/application/video_tag/metrics/youtube8m/mean_average_precision_calculator.py @@ -0,0 +1,114 @@ +# Copyright 2016 Google Inc. All Rights Reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS-IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +"""Calculate the mean average precision. + +It provides an interface for calculating mean average precision +for an entire list or the top-n ranked items. 
+
+Example usages:
+We first call the function accumulate many times to process parts of the ranked
+list. After processing all the parts, we call peek_map_at_n
+to calculate the mean average precision.
+
+```
+import random
+
+p = np.array([[random.random() for _ in range(50)] for _ in range(1000)])
+a = np.array([[random.choice([0, 1]) for _ in range(50)]
+              for _ in range(1000)])
+
+# mean average precision for 50 classes.
+calculator = mean_average_precision_calculator.MeanAveragePrecisionCalculator(
+    num_class=50)
+calculator.accumulate(p, a)
+aps = calculator.peek_map_at_n()
+```
+"""
+
+import numpy
+from . import average_precision_calculator
+
+
+class MeanAveragePrecisionCalculator(object):
+    """This class is to calculate mean average precision.
+    """
+
+    def __init__(self, num_class):
+        """Construct a calculator to calculate the (macro) average precision.
+
+        Args:
+          num_class: A positive integer specifying the number of classes,
+            which must be greater than 1.
+
+        Raises:
+          ValueError: An error occurred when num_class is not an integer
+            greater than 1.
+        """
+        if not isinstance(num_class, int) or num_class <= 1:
+            raise ValueError("num_class must be an integer greater than 1.")
+
+        self._ap_calculators = []  # per-class AveragePrecisionCalculator instances
+        self._num_class = num_class  # total number of classes
+        for i in range(num_class):
+            self._ap_calculators.append(
+                average_precision_calculator.AveragePrecisionCalculator())
+
+    def accumulate(self, predictions, actuals, num_positives=None):
+        """Accumulate the predictions and their ground truth labels.
+
+        Args:
+          predictions: A list of lists storing the prediction scores. The outer
+            dimension corresponds to classes.
+          actuals: A list of lists storing the ground truth labels.
The dimensions
+            should correspond to the predictions input. Any value
+            larger than 0 will be treated as positives, otherwise as negatives.
+          num_positives: If provided, it is a list of numbers representing the
+            number of true positives for each class. If not provided, the number
+            of true positives will be inferred from the 'actuals' array.
+
+        Raises:
+          ValueError: An error occurred when the shape of predictions and actuals
+            does not match.
+        """
+        if num_positives is None:
+            # one entry per class; the outer dimension of 'predictions' is classes
+            num_positives = [None for _ in range(len(predictions))]
+
+        calculators = self._ap_calculators
+        for i in range(len(predictions)):
+            calculators[i].accumulate(predictions[i], actuals[i],
+                                      num_positives[i])
+
+    def clear(self):
+        for calculator in self._ap_calculators:
+            calculator.clear()
+
+    def is_empty(self):
+        return ([calculator.heap_size for calculator in self._ap_calculators] ==
+                [0 for _ in range(self._num_class)])
+
+    def peek_map_at_n(self):
+        """Peek the non-interpolated mean average precision at n.
+
+        Returns:
+          An array of non-interpolated average precision at n (default 0) for each
+            class.
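Conceptually, this returns one AP per class; the unweighted mean of that list is the macro mAP. A self-contained sketch with two toy classes (plain numpy, illustrative only):

```python
import numpy as np

def ap(preds, actuals):
    # Non-interpolated average precision of one class's ranked list.
    order = np.argsort(-preds)
    num_pos = int(np.sum(actuals > 0))
    hits, total = 0, 0.0
    for rank, i in enumerate(order, start=1):
        if actuals[i] > 0:
            hits += 1
            total += hits / float(rank)
    return total / num_pos if num_pos else 0.0

# Two classes; mAP is the unweighted mean of per-class APs.
aps = [ap(np.array([0.9, 0.1]), np.array([1, 0])),
       ap(np.array([0.2, 0.7]), np.array([1, 0]))]
print(sum(aps) / len(aps))  # 0.75
```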
+ """ + aps = [ + self._ap_calculators[i].peek_ap_at_n() + for i in range(self._num_class) + ] + return aps diff --git a/PaddleCV/video/application/video_tag/models/__init__.py b/PaddleCV/video/application/video_tag/models/__init__.py new file mode 100644 index 0000000000000000000000000000000000000000..4a3adbbfb4ee895e532f03bb2d392ef88dcd4dcf --- /dev/null +++ b/PaddleCV/video/application/video_tag/models/__init__.py @@ -0,0 +1,7 @@ +from .model import regist_model, get_model +from .attention_lstm import AttentionLSTM +from .tsn import TSN + +# regist models, sort by alphabet +regist_model("AttentionLSTM", AttentionLSTM) +regist_model("TSN", TSN) diff --git a/PaddleCV/video/application/video_tag/models/attention_lstm/__init__.py b/PaddleCV/video/application/video_tag/models/attention_lstm/__init__.py new file mode 100644 index 0000000000000000000000000000000000000000..cb872f0e43ab52054b42970896e5791a0eeb691d --- /dev/null +++ b/PaddleCV/video/application/video_tag/models/attention_lstm/__init__.py @@ -0,0 +1 @@ +from .attention_lstm import * diff --git a/PaddleCV/video/application/video_tag/models/attention_lstm/attention_lstm.py b/PaddleCV/video/application/video_tag/models/attention_lstm/attention_lstm.py new file mode 100644 index 0000000000000000000000000000000000000000..dbf417ae98b9d27cd858d9b6ac66973ccde917f2 --- /dev/null +++ b/PaddleCV/video/application/video_tag/models/attention_lstm/attention_lstm.py @@ -0,0 +1,149 @@ +# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserve. +# +#Licensed under the Apache License, Version 2.0 (the "License"); +#you may not use this file except in compliance with the License. +#You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +#Unless required by applicable law or agreed to in writing, software +#distributed under the License is distributed on an "AS IS" BASIS, +#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
+#See the License for the specific language governing permissions and +#limitations under the License. + +import numpy as np + +import paddle.fluid as fluid +from paddle.fluid import ParamAttr + +from ..model import ModelBase +from .lstm_attention import LSTMAttentionModel + +import logging +logger = logging.getLogger(__name__) + +__all__ = ["AttentionLSTM"] + + +class AttentionLSTM(ModelBase): + def __init__(self, name, cfg, mode='train'): + super(AttentionLSTM, self).__init__(name, cfg, mode) + self.get_config() + + def get_config(self): + # get model configs + self.feature_num = self.cfg.MODEL.feature_num + self.feature_names = self.cfg.MODEL.feature_names + self.feature_dims = self.cfg.MODEL.feature_dims + self.num_classes = self.cfg.MODEL.num_classes + self.embedding_size = self.cfg.MODEL.embedding_size + self.lstm_size = self.cfg.MODEL.lstm_size + self.drop_rate = self.cfg.MODEL.drop_rate + + # get mode configs + self.batch_size = self.get_config_from_sec(self.mode, 'batch_size', 1) + self.num_gpus = self.get_config_from_sec(self.mode, 'num_gpus', 1) + + def build_input(self, use_dataloader): + self.feature_input = [] + for name, dim in zip(self.feature_names, self.feature_dims): + self.feature_input.append( + fluid.data( + shape=[None, dim], lod_level=1, dtype='float32', name=name)) + #video_tag without label_input + if use_dataloader: + assert self.mode != 'infer', \ + 'dataloader is not recommendated when infer, please set use_dataloader to be false.' 
+ self.dataloader = fluid.io.DataLoader.from_generator( + feed_list=self.feature_input, #video_tag + capacity=8, + iterable=True) + + def build_model(self): + att_outs = [] + for i, (input_dim, feature + ) in enumerate(zip(self.feature_dims, self.feature_input)): + att = LSTMAttentionModel(input_dim, self.embedding_size, + self.lstm_size, self.drop_rate) + att_out = att.forward(feature, is_training=(self.mode == 'train')) + att_outs.append(att_out) + if len(att_outs) > 1: + out = fluid.layers.concat(att_outs, axis=1) + else: + out = att_outs[0] + + fc1 = fluid.layers.fc( + input=out, + size=8192, + act='relu', + bias_attr=ParamAttr( + regularizer=fluid.regularizer.L2Decay(0.0), + initializer=fluid.initializer.NormalInitializer(scale=0.0)), + name='fc1') + fc2 = fluid.layers.fc( + input=fc1, + size=4096, + act='tanh', + bias_attr=ParamAttr( + regularizer=fluid.regularizer.L2Decay(0.0), + initializer=fluid.initializer.NormalInitializer(scale=0.0)), + name='fc2') + + self.logit = fluid.layers.fc(input=fc2, size=self.num_classes, act=None, \ + bias_attr=ParamAttr(regularizer=fluid.regularizer.L2Decay(0.0), + initializer=fluid.initializer.NormalInitializer(scale=0.0)), + name = 'output') + + self.output = fluid.layers.sigmoid(self.logit) + + def optimizer(self): + assert self.mode == 'train', "optimizer only can be get in train mode" + values = [ + self.learning_rate * (self.decay_gamma**i) + for i in range(len(self.decay_epochs) + 1) + ] + iter_per_epoch = self.num_samples / self.batch_size + boundaries = [e * iter_per_epoch for e in self.decay_epochs] + return fluid.optimizer.RMSProp( + learning_rate=fluid.layers.piecewise_decay( + values=values, boundaries=boundaries), + centered=True, + regularization=fluid.regularizer.L2Decay(self.weight_decay)) + + def loss(self): + assert self.mode != 'infer', "invalid loss calculationg in infer mode" + cost = fluid.layers.sigmoid_cross_entropy_with_logits( + x=self.logit, label=self.label_input) + cost = 
fluid.layers.reduce_sum(cost, dim=-1) + sum_cost = fluid.layers.reduce_sum(cost) + self.loss_ = fluid.layers.scale( + sum_cost, scale=self.num_gpus, bias_after_scale=False) + return self.loss_ + + def outputs(self): + return [self.output, self.logit] + + def feeds(self): + return self.feature_input + + def fetches(self): + fetch_list = [self.output] + return fetch_list + + def weights_info(self): + return None + + def load_pretrain_params(self, exe, pretrain, prog, place): + logger.info("Load pretrain weights from {}, exclude fc layer.".format( + pretrain)) + + state_dict = fluid.load_program_state(pretrain) + dict_keys = list(state_dict.keys()) + for name in dict_keys: + if "fc_0" in name: + del state_dict[name] + logger.info( + 'Delete {} from pretrained parameters. Do not load it'. + format(name)) + fluid.set_program_state(prog, state_dict) diff --git a/PaddleCV/video/application/video_tag/models/attention_lstm/lstm_attention.py b/PaddleCV/video/application/video_tag/models/attention_lstm/lstm_attention.py new file mode 100644 index 0000000000000000000000000000000000000000..baca36c13b663bd2c4589a2876f72a731a1ec487 --- /dev/null +++ b/PaddleCV/video/application/video_tag/models/attention_lstm/lstm_attention.py @@ -0,0 +1,87 @@ +# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserve. +# +#Licensed under the Apache License, Version 2.0 (the "License"); +#you may not use this file except in compliance with the License. +#You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +#Unless required by applicable law or agreed to in writing, software +#distributed under the License is distributed on an "AS IS" BASIS, +#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +#See the License for the specific language governing permissions and +#limitations under the License. 
+
+import paddle.fluid as fluid
+from paddle.fluid import ParamAttr
+import numpy as np
+
+
+class LSTMAttentionModel(object):
+    """LSTM Attention Model"""
+
+    def __init__(self,
+                 input_dim,
+                 embedding_size=512,
+                 lstm_size=1024,
+                 drop_rate=0.5):
+        # input_dim is accepted for interface compatibility but unused here;
+        # the input feature dimension is inferred by the first fc layer below.
+        self.lstm_size = lstm_size
+        self.embedding_size = embedding_size
+        self.drop_rate = drop_rate
+
+    def forward(self, input, is_training):
+        input_fc = fluid.layers.fc(
+            input=input,
+            size=self.embedding_size,
+            act='tanh',
+            bias_attr=ParamAttr(
+                regularizer=fluid.regularizer.L2Decay(0.0),
+                initializer=fluid.initializer.NormalInitializer(scale=0.0)),
+            name='rgb_fc')
+
+        lstm_forward_fc = fluid.layers.fc(
+            input=input_fc,
+            size=self.lstm_size * 4,
+            act=None,
+            bias_attr=False,  # video_tag
+            name='rgb_fc_forward')
+
+        lstm_forward, _ = fluid.layers.dynamic_lstm(
+            input=lstm_forward_fc,
+            size=self.lstm_size * 4,
+            is_reverse=False,
+            name='rgb_lstm_forward')
+
+        lstm_backward_fc = fluid.layers.fc(
+            input=input_fc,
+            size=self.lstm_size * 4,
+            act=None,
+            bias_attr=False,  # video_tag
+            name='rgb_fc_backward')
+
+        lstm_backward, _ = fluid.layers.dynamic_lstm(
+            input=lstm_backward_fc,
+            size=self.lstm_size * 4,
+            is_reverse=True,
+            name='rgb_lstm_backward')
+
+        lstm_concat = fluid.layers.concat(
+            input=[lstm_forward, lstm_backward], axis=1)
+
+        lstm_dropout = fluid.layers.dropout(
+            x=lstm_concat,
+            dropout_prob=self.drop_rate,
+            is_test=(not is_training))
+
+        lstm_weight = fluid.layers.fc(
+            input=lstm_dropout,
+            size=1,
+            act='sequence_softmax',
+            bias_attr=False,  # video_tag
+            name='rgb_weight')
+
+        scaled = fluid.layers.elementwise_mul(
+            x=lstm_dropout, y=lstm_weight, axis=0)
+        lstm_pool = fluid.layers.sequence_pool(input=scaled, pool_type='sum')
+
+        return lstm_pool
diff --git a/PaddleCV/video/application/video_tag/models/model.py b/PaddleCV/video/application/video_tag/models/model.py
new file mode 100644
index 0000000000000000000000000000000000000000..512502ae763355622ca2e6ec27a9187c905ac450
--- /dev/null
+++ b/PaddleCV/video/application/video_tag/models/model.py
@@ -0,0 +1,191 @@
+# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
+#
+#Licensed under the Apache License, Version 2.0 (the "License");
+#you may not use this file except in compliance with the License.
+#You may obtain a copy of the License at
+#
+#    http://www.apache.org/licenses/LICENSE-2.0
+#
+#Unless required by applicable law or agreed to in writing, software
+#distributed under the License is distributed on an "AS IS" BASIS,
+#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#See the License for the specific language governing permissions and
+#limitations under the License.
+
+import os
+import wget
+import logging
+try:
+    from configparser import ConfigParser
+except ImportError:
+    from ConfigParser import ConfigParser  # Python 2 fallback
+
+import paddle.fluid as fluid
+from .utils import download, AttrDict
+
+WEIGHT_DIR = os.path.join(os.path.expanduser('~'), '.paddle', 'weights')
+
+logger = logging.getLogger(__name__)
+
+
+def is_parameter(var):
+    return isinstance(var, fluid.framework.Parameter)
+
+
+class NotImplementError(Exception):
+    "Error: model function not implemented"
+
+    def __init__(self, model, function):
+        super(NotImplementError, self).__init__()
+        self.model = model.__class__.__name__
+        self.function = function.__name__
+
+    def __str__(self):
+        return "Function {}() is not implemented in model {}".format(
+            self.function, self.model)
+
+
+class ModelNotFoundError(Exception):
+    "Error: model not found"
+
+    def __init__(self, model_name, avail_models):
+        super(ModelNotFoundError, self).__init__()
+        self.model_name = model_name
+        self.avail_models = avail_models
+
+    def __str__(self):
+        msg = "Model {} Not Found.\nAvailable models:\n".format(
+            self.model_name)
+        for model in self.avail_models:
+            msg += "    {}\n".format(model)
+        return msg
+
+
+class ModelBase(object):
+    def __init__(self, name, cfg, mode='train'):
+        assert mode in ['train', 'valid', 'test', 'infer'], \
+            
"Unknown mode type {}".format(mode)
+        self.name = name
+        self.is_training = (mode == 'train')
+        self.mode = mode
+        self.cfg = cfg
+        self.dataloader = None
+
+    def build_model(self):
+        "build model struct"
+        raise NotImplementError(self, self.build_model)
+
+    def build_input(self, use_dataloader):
+        "build input Variable"
+        raise NotImplementError(self, self.build_input)
+
+    def optimizer(self):
+        "get model optimizer"
+        raise NotImplementError(self, self.optimizer)
+
+    def outputs(self):
+        "get output variable"
+        raise NotImplementError(self, self.outputs)
+
+    def loss(self):
+        "get loss variable"
+        raise NotImplementError(self, self.loss)
+
+    def feeds(self):
+        "get feed inputs list"
+        raise NotImplementError(self, self.feeds)
+
+    def fetches(self):
+        "get fetch list of model"
+        raise NotImplementError(self, self.fetches)
+
+    def weights_info(self):
+        "get model weight default path and download url"
+        raise NotImplementError(self, self.weights_info)
+
+    def get_weights(self):
+        "get model weight file path, download the weights from Paddle if they do not exist"
+        path, url = self.weights_info()
+        path = os.path.join(WEIGHT_DIR, path)
+        if not os.path.isdir(WEIGHT_DIR):
+            logger.info('{} not exists, will be created automatically.'.format(
+                WEIGHT_DIR))
+            os.makedirs(WEIGHT_DIR)
+        if os.path.exists(path):
+            return path
+
+        logger.info("Download weights of {} from {}".format(self.name, url))
+        wget.download(url, path)
+        return path
+
+    def dataloader(self):
+        # NOTE: shadowed by the instance attribute of the same name set in
+        # __init__ / build_input.
+        return self.dataloader
+
+    def epoch_num(self):
+        "get train epoch num"
+        return self.cfg.TRAIN.epoch
+
+    def pretrain_info(self):
+        "get pretrain base model directory"
+        return (None, None)
+
+    def get_pretrain_weights(self):
+        "get pretrain weight file path, download the weights from Paddle if they do not exist"
+        path, url = self.pretrain_info()
+        if not path:
+            return None
+
+        path = os.path.join(WEIGHT_DIR, path)
+        if not os.path.isdir(WEIGHT_DIR):
+            logger.info('{} not exists, will be created automatically.'.format(
+                WEIGHT_DIR))
+            
os.makedirs(WEIGHT_DIR) + if os.path.exists(path): + return path + + logger.info("Download pretrain weights of {} from {}".format(self.name, + url)) + download(url, path) + return path + + def load_pretrain_params(self, exe, pretrain, prog, place): + logger.info("Load pretrain weights from {}".format(pretrain)) + state_dict = fluid.load_program_state(pretrain) + fluid.set_program_state(prog, state_dict) + + def load_test_weights(self, exe, weights, prog): + params_list = list(filter(is_parameter, prog.list_vars())) + fluid.load(prog, weights, executor=exe, var_list=params_list) + + def get_config_from_sec(self, sec, item, default=None): + if sec.upper() not in self.cfg: + return default + return self.cfg[sec.upper()].get(item, default) + + +class ModelZoo(object): + def __init__(self): + self.model_zoo = {} + + def regist(self, name, model): + assert model.__base__ == ModelBase, "Unknow model type {}".format( + type(model)) + self.model_zoo[name] = model + + def get(self, name, cfg, mode='train'): + for k, v in self.model_zoo.items(): + if k.upper() == name.upper(): + return v(name, cfg, mode) + raise ModelNotFoundError(name, self.model_zoo.keys()) + + +# singleton model_zoo +model_zoo = ModelZoo() + + +def regist_model(name, model): + model_zoo.regist(name, model) + + +def get_model(name, cfg, mode='train'): + return model_zoo.get(name, cfg, mode) diff --git a/PaddleCV/video/application/video_tag/models/tsn/__init__.py b/PaddleCV/video/application/video_tag/models/tsn/__init__.py new file mode 100644 index 0000000000000000000000000000000000000000..bd57d2687bc948e63dd88306e9d435bbbb5a7978 --- /dev/null +++ b/PaddleCV/video/application/video_tag/models/tsn/__init__.py @@ -0,0 +1 @@ +from .tsn import * diff --git a/PaddleCV/video/application/video_tag/models/tsn/tsn.py b/PaddleCV/video/application/video_tag/models/tsn/tsn.py new file mode 100644 index 0000000000000000000000000000000000000000..4bbce1874efa143c5a178455fa1765fa6e761e34 --- /dev/null +++ 
b/PaddleCV/video/application/video_tag/models/tsn/tsn.py @@ -0,0 +1,172 @@ +# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserve. +# +#Licensed under the Apache License, Version 2.0 (the "License"); +#you may not use this file except in compliance with the License. +#You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +#Unless required by applicable law or agreed to in writing, software +#distributed under the License is distributed on an "AS IS" BASIS, +#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +#See the License for the specific language governing permissions and +#limitations under the License. + +import numpy as np + +import paddle.fluid as fluid +from paddle.fluid import ParamAttr + +from ..model import ModelBase +from .tsn_res_model import TSN_ResNet + +import logging +logger = logging.getLogger(__name__) + +__all__ = ["TSN"] + + +class TSN(ModelBase): + def __init__(self, name, cfg, mode='train'): + super(TSN, self).__init__(name, cfg, mode=mode) + self.get_config() + + def get_config(self): + self.num_classes = self.get_config_from_sec('model', 'num_classes') + self.seg_num = self.get_config_from_sec('model', 'seg_num') + self.seglen = self.get_config_from_sec('model', 'seglen') + self.image_mean = self.get_config_from_sec('model', 'image_mean') + self.image_std = self.get_config_from_sec('model', 'image_std') + self.num_layers = self.get_config_from_sec('model', 'num_layers') + + self.num_epochs = self.get_config_from_sec('train', 'epoch') + self.total_videos = self.get_config_from_sec('train', 'total_videos') + self.base_learning_rate = self.get_config_from_sec('train', + 'learning_rate') + self.learning_rate_decay = self.get_config_from_sec( + 'train', 'learning_rate_decay') + self.l2_weight_decay = self.get_config_from_sec('train', + 'l2_weight_decay') + self.momentum = self.get_config_from_sec('train', 'momentum') + + self.seg_num = self.get_config_from_sec(self.mode, 
'seg_num', + self.seg_num) + self.target_size = self.get_config_from_sec(self.mode, 'target_size') + self.batch_size = self.get_config_from_sec(self.mode, 'batch_size') + + def build_input(self, use_dataloader=True): + image_shape = [3, self.target_size, self.target_size] + image_shape[0] = image_shape[0] * self.seglen + image_shape = [None, self.seg_num] + image_shape + self.use_dataloader = use_dataloader + + image = fluid.data(name='image', shape=image_shape, dtype='float32') + if self.mode != 'infer': + label = fluid.data(name='label', shape=[None, 1], dtype='int64') + else: + label = None + + if use_dataloader: + assert self.mode != 'infer', \ + 'dataloader is not recommended in infer mode, please set use_dataloader to False.' + self.dataloader = fluid.io.DataLoader.from_generator( + feed_list=[image, label], capacity=4, iterable=True) + + self.feature_input = [image] + self.label_input = label + + def create_model_args(self): + cfg = {} + cfg['layers'] = self.num_layers + cfg['class_dim'] = self.num_classes + cfg['seg_num'] = self.seg_num + return cfg + + def build_model(self): + cfg = self.create_model_args() + videomodel = TSN_ResNet( + layers=cfg['layers'], + seg_num=cfg['seg_num'], + is_training=(self.mode == 'train')) + out = videomodel.net(input=self.feature_input[0], + class_dim=cfg['class_dim']) + # videotag only needs the extractor feature + self.feature_output = out + + def optimizer(self): + assert self.mode == 'train', "optimizer can only be created in train mode" + epoch_points = [self.num_epochs / 3, self.num_epochs * 2 / 3] + total_videos = self.total_videos + step = int(total_videos / self.batch_size + 1) + bd = [e * step for e in epoch_points] + base_lr = self.base_learning_rate + lr_decay = self.learning_rate_decay + lr = [base_lr, base_lr * lr_decay, base_lr * lr_decay * lr_decay] + l2_weight_decay = self.l2_weight_decay + momentum = self.momentum + optimizer = fluid.optimizer.Momentum( + learning_rate=fluid.layers.piecewise_decay( + 
boundaries=bd, values=lr), + momentum=momentum, + regularization=fluid.regularizer.L2Decay(l2_weight_decay)) + + return optimizer + + def loss(self): + assert self.mode != 'infer', "invalid loss calculation in infer mode" + cost = fluid.layers.cross_entropy(input=self.network_outputs[0], \ + label=self.label_input, ignore_index=-1) + self.loss_ = fluid.layers.mean(x=cost) + return self.loss_ + + def outputs(self): + return self.network_outputs + + def feeds(self): + return self.feature_input if self.mode == 'infer' else self.feature_input + [ + self.label_input + ] + + def fetches(self): + if self.mode == 'train' or self.mode == 'valid': + losses = self.loss() + fetch_list = [losses, self.network_outputs[0], self.label_input] + elif self.mode == 'test': + losses = self.loss() + fetch_list = [self.feature_output, self.label_input] + elif self.mode == 'infer': + fetch_list = self.feature_output + else: + raise NotImplementedError('mode {} not implemented'.format( + self.mode)) + + return fetch_list + + def pretrain_info(self): + return ( + 'ResNet50_pretrained', + 'https://paddlemodels.bj.bcebos.com/video_classification/ResNet50_pretrained.tar.gz' + ) + + def weights_info(self): + return None + + def load_pretrain_params(self, exe, pretrain, prog, place): + def is_parameter(var): + return isinstance(var, fluid.framework.Parameter) + + params_list = list(filter(is_parameter, prog.list_vars())) + for param in params_list: + print(param.name) + + logger.info("Load pretrain weights from {}, exclude fc layer.".format( + pretrain)) + + state_dict = fluid.load_program_state(pretrain) + dict_keys = list(state_dict.keys()) + for name in dict_keys: + if "fc_0" in name: + del state_dict[name] + print('Delete {} from pretrained parameters. Do not load it'. 
+ format(name)) + fluid.set_program_state(prog, state_dict) diff --git a/PaddleCV/video/application/video_tag/models/tsn/tsn_res_model.py b/PaddleCV/video/application/video_tag/models/tsn/tsn_res_model.py new file mode 100644 index 0000000000000000000000000000000000000000..05027bb2bfeb3379095ee9b49483f7e8618686b8 --- /dev/null +++ b/PaddleCV/video/application/video_tag/models/tsn/tsn_res_model.py @@ -0,0 +1,150 @@ +# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserve. +# +#Licensed under the Apache License, Version 2.0 (the "License"); +#you may not use this file except in compliance with the License. +#You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +#Unless required by applicable law or agreed to in writing, software +#distributed under the License is distributed on an "AS IS" BASIS, +#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +#See the License for the specific language governing permissions and +#limitations under the License. 
+ +import os +import time +import sys +import paddle.fluid as fluid +import math + + +class TSN_ResNet(object): + def __init__(self, layers=50, seg_num=7, is_training=True): + self.layers = 101  # video_tag fixes the backbone to ResNet-101, overriding the layers argument + self.seg_num = seg_num + self.is_training = is_training + + def conv_bn_layer(self, + input, + num_filters, + filter_size, + stride=1, + groups=1, + act=None, + name=None): + conv = fluid.layers.conv2d( + input=input, + num_filters=num_filters, + filter_size=filter_size, + stride=stride, + padding=(filter_size - 1) // 2, + groups=groups, + act=None, + param_attr=fluid.param_attr.ParamAttr(name=name + "_weights"), + bias_attr=False) + if name == "conv1": + bn_name = "bn_" + name + else: + bn_name = "bn" + name[3:] + + return fluid.layers.batch_norm( + input=conv, + act=act, + is_test=(not self.is_training), + param_attr=fluid.param_attr.ParamAttr(name=bn_name + "_scale"), + bias_attr=fluid.param_attr.ParamAttr(bn_name + '_offset'), + moving_mean_name=bn_name + "_mean", + moving_variance_name=bn_name + '_variance') + + def shortcut(self, input, ch_out, stride, name): + ch_in = input.shape[1] + if ch_in != ch_out or stride != 1: + return self.conv_bn_layer(input, ch_out, 1, stride, name=name) + else: + return input + + def bottleneck_block(self, input, num_filters, stride, name): + conv0 = self.conv_bn_layer( + input=input, + num_filters=num_filters, + filter_size=1, + act='relu', + name=name + "_branch2a") + conv1 = self.conv_bn_layer( + input=conv0, + num_filters=num_filters, + filter_size=3, + stride=stride, + act='relu', + name=name + "_branch2b") + conv2 = self.conv_bn_layer( + input=conv1, + num_filters=num_filters * 4, + filter_size=1, + act=None, + name=name + "_branch2c") + + short = self.shortcut( + input, num_filters * 4, stride, name=name + "_branch1") + + return fluid.layers.elementwise_add(x=short, y=conv2, act='relu') + + def net(self, input, class_dim=101): + layers = self.layers + seg_num = self.seg_num + supported_layers = [50, 101, 152] + assert layers in 
supported_layers, \ + "supported layers are {} but input layer is {}".format(supported_layers, layers) + + # reshape input + channels = input.shape[2] + short_size = input.shape[3] + input = fluid.layers.reshape( + x=input, shape=[-1, channels, short_size, short_size]) + + if layers == 50: + depth = [3, 4, 6, 3] + elif layers == 101: + depth = [3, 4, 23, 3] + elif layers == 152: + depth = [3, 8, 36, 3] + num_filters = [64, 128, 256, 512] + + conv = self.conv_bn_layer( + input=input, + num_filters=64, + filter_size=7, + stride=2, + act='relu', + name='conv1') + conv = fluid.layers.pool2d( + input=conv, + pool_size=3, + pool_stride=2, + pool_padding=1, + pool_type='max') + + for block in range(len(depth)): + for i in range(depth[block]): + if layers in [101, 152] and block == 2: + if i == 0: + conv_name = "res" + str(block + 2) + "a" + else: + conv_name = "res" + str(block + 2) + "b" + str(i) + else: + conv_name = "res" + str(block + 2) + chr(97 + i) + + conv = self.bottleneck_block( + input=conv, + num_filters=num_filters[block], + stride=2 if i == 0 and block != 0 else 1, + name=conv_name) + + pool = fluid.layers.pool2d( + input=conv, pool_size=7, pool_type='avg', global_pooling=True) + + # video_tag only needs the extractor feature + feature = fluid.layers.reshape( + x=pool, shape=[-1, seg_num, pool.shape[1]]) + return feature diff --git a/PaddleCV/video/application/video_tag/models/utils.py b/PaddleCV/video/application/video_tag/models/utils.py new file mode 100644 index 0000000000000000000000000000000000000000..3480794285d0b2da3832c25ff3512c5678e2b0e1 --- /dev/null +++ b/PaddleCV/video/application/video_tag/models/utils.py @@ -0,0 +1,47 @@ +# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserve. +# +#Licensed under the Apache License, Version 2.0 (the "License"); +#you may not use this file except in compliance with the License. 
+#You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +#Unless required by applicable law or agreed to in writing, software +#distributed under the License is distributed on an "AS IS" BASIS, +#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +#See the License for the specific language governing permissions and +#limitations under the License. + +import os +import wget +import tarfile + +__all__ = ['decompress', 'download', 'AttrDict'] + + +def decompress(path): + t = tarfile.open(path) + t.extractall(path=os.path.split(path)[0]) + t.close() + os.remove(path) + + +def download(url, path): + weight_dir = os.path.split(path)[0] + if not os.path.exists(weight_dir): + os.makedirs(weight_dir) + + path = path + ".tar.gz" + wget.download(url, path) + decompress(path) + + +class AttrDict(dict): + def __getattr__(self, key): + return self[key] + + def __setattr__(self, key, value): + if key in self.__dict__: + self.__dict__[key] = value + else: + self[key] = value diff --git a/PaddleCV/video/application/video_tag/reader/__init__.py b/PaddleCV/video/application/video_tag/reader/__init__.py new file mode 100644 index 0000000000000000000000000000000000000000..d419ab75df3a105329c65f7d96a78f3b1964823c --- /dev/null +++ b/PaddleCV/video/application/video_tag/reader/__init__.py @@ -0,0 +1,5 @@ +from .reader_utils import regist_reader, get_reader +from .kinetics_reader import KineticsReader + +# regist reader, sort by alphabet +regist_reader("TSN", KineticsReader) diff --git a/PaddleCV/video/application/video_tag/reader/kinetics_reader.py b/PaddleCV/video/application/video_tag/reader/kinetics_reader.py new file mode 100644 index 0000000000000000000000000000000000000000..4eb560a111b2c47752226ba0776158d657280cf1 --- /dev/null +++ b/PaddleCV/video/application/video_tag/reader/kinetics_reader.py @@ -0,0 +1,255 @@ +# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserve. 
+# +#Licensed under the Apache License, Version 2.0 (the "License"); +#you may not use this file except in compliance with the License. +#You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +#Unless required by applicable law or agreed to in writing, software +#distributed under the License is distributed on an "AS IS" BASIS, +#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +#See the License for the specific language governing permissions and +#limitations under the License. + +import os +import sys +import cv2 +import math +import random +import functools +import time +try: + import cPickle as pickle + from cStringIO import StringIO +except ImportError: + import pickle + from io import BytesIO +import numpy as np +import paddle.fluid as fluid +from PIL import Image, ImageEnhance +import logging + +from .reader_utils import DataReader + +logger = logging.getLogger(__name__) +python_ver = sys.version_info + + +class KineticsReader(DataReader): + """ + Data reader for kinetics dataset in two formats, mp4 and pkl. + 1. mp4, the original format of kinetics400 + 2. pkl, the mp4 was decoded previously and stored as pkl + In both cases, load the data, and then get the frame data as numpy arrays and the label as an integer. 
+ dataset cfg: format + num_classes + seg_num + short_size + target_size + num_reader_threads + buf_size + image_mean + image_std + batch_size + list + """ + + def __init__(self, name, mode, cfg): + super(KineticsReader, self).__init__(name, mode, cfg) + self.format = cfg.MODEL.format + self.num_classes = self.get_config_from_sec('model', 'num_classes') + self.seg_num = self.get_config_from_sec('model', 'seg_num') + self.seglen = self.get_config_from_sec('model', 'seglen') + + self.seg_num = self.get_config_from_sec(mode, 'seg_num', self.seg_num) + self.short_size = self.get_config_from_sec(mode, 'short_size') + self.target_size = self.get_config_from_sec(mode, 'target_size') + self.num_reader_threads = self.get_config_from_sec(mode, + 'num_reader_threads') + self.buf_size = self.get_config_from_sec(mode, 'buf_size') + self.fix_random_seed = self.get_config_from_sec(mode, 'fix_random_seed') + + self.img_mean = np.array(cfg.MODEL.image_mean).reshape( + [3, 1, 1]).astype(np.float32) + self.img_std = np.array(cfg.MODEL.image_std).reshape( + [3, 1, 1]).astype(np.float32) + # set batch size and file list + self.batch_size = cfg[mode.upper()]['batch_size'] + self.filelist = cfg[mode.upper()]['filelist'] + if self.fix_random_seed: + random.seed(0) + np.random.seed(0) + self.num_reader_threads = 1 + + def create_reader(self): + assert os.path.exists(self.filelist), \ + '{} not exist, please check the data list'.format(self.filelist) + _reader = self._reader_creator(self.filelist, self.mode, seg_num=self.seg_num, seglen = self.seglen, \ + short_size = self.short_size, target_size = self.target_size, \ + img_mean = self.img_mean, img_std = self.img_std, \ + shuffle = (self.mode == 'train'), \ + num_threads = self.num_reader_threads, \ + buf_size = self.buf_size, format = self.format) + + def _batch_reader(): + batch_out = [] + for imgs, label in _reader(): + if imgs is None: + continue + batch_out.append((imgs, label)) + if len(batch_out) == self.batch_size: + yield 
batch_out + batch_out = [] + + return _batch_reader + + def _reader_creator(self, + pickle_list, + mode, + seg_num, + seglen, + short_size, + target_size, + img_mean, + img_std, + shuffle=False, + num_threads=1, + buf_size=1024, + format='pkl'): + def decode_mp4(sample, mode, seg_num, seglen, short_size, target_size, + img_mean, img_std): + sample = sample[0].split(' ') + mp4_path = sample[0] + try: + load_time1 = time.time() + imgs = mp4_loader(mp4_path, seg_num, seglen, mode) + load_time2 = time.time() + if len(imgs) < 1: + logger.error('{} frame length {} less than 1.'.format( + mp4_path, len(imgs))) + return None, None + except: + logger.error('Error when loading {}'.format(mp4_path)) + return None, None + + transform_time_1 = time.time() + imgs = imgs_transform( + imgs, + mode, + seg_num, + seglen, + short_size, + target_size, + img_mean, + img_std, + name=self.name) + transform_time_2 = time.time() + return imgs, mp4_path + + def reader(): + with open(pickle_list) as flist: + lines = [line.strip() for line in flist] + if shuffle: + random.shuffle(lines) + for line in lines: + pickle_path = line.strip() + yield [pickle_path] + + mapper = functools.partial( + decode_mp4, + mode=mode, + seg_num=seg_num, + seglen=seglen, + short_size=short_size, + target_size=target_size, + img_mean=img_mean, + img_std=img_std) + + return fluid.io.xmap_readers(mapper, reader, num_threads, buf_size) + + +def imgs_transform(imgs, + mode, + seg_num, + seglen, + short_size, + target_size, + img_mean, + img_std, + name=''): + imgs = group_scale(imgs, short_size) + + np_imgs = np.array([np.array(img).astype('float32') for img in imgs]) #dhwc + np_imgs = group_center_crop(np_imgs, target_size) + np_imgs = np_imgs.transpose(0, 3, 1, 2) / 255 #dchw + np_imgs -= img_mean + np_imgs /= img_std + + return np_imgs + + +def group_center_crop(np_imgs, target_size): + d, h, w, c = np_imgs.shape + th, tw = target_size, target_size + assert (w >= target_size) and (h >= target_size), \ + "image 
width({}) and height({}) should be larger than crop size".format(w, h, target_size) + + h_off = int(round((h - th) / 2.)) + w_off = int(round((w - tw) / 2.)) + + img_crop = np_imgs[:, h_off:h_off + target_size, w_off:w_off + + target_size, :] + return img_crop + + +def group_scale(imgs, target_size): + resized_imgs = [] + for i in range(len(imgs)): + img = imgs[i] + w, h = img.size + if (w <= h and w == target_size) or (h <= w and h == target_size): + resized_imgs.append(img) + continue + + if w < h: + ow = target_size + oh = int(target_size * 4.0 / 3.0) + resized_imgs.append(img.resize((ow, oh), Image.BILINEAR)) + else: + oh = target_size + ow = int(target_size * 4.0 / 3.0) + resized_imgs.append(img.resize((ow, oh), Image.BILINEAR)) + + return resized_imgs + + +def mp4_loader(filepath, nsample, seglen, mode): + cap = cv2.VideoCapture(filepath) + videolen = int(cap.get(cv2.CAP_PROP_FRAME_COUNT)) + sampledFrames = [] + for i in range(videolen): + ret, frame = cap.read() + # maybe first frame is empty + if ret == False: + continue + img = frame[:, :, ::-1] + sampledFrames.append(img) + average_dur = int(len(sampledFrames) / nsample) + imgs = [] + for i in range(nsample): + idx = 0 + if average_dur >= seglen: + idx = (average_dur - 1) // 2 + idx += i * average_dur + elif average_dur >= 1: + idx += i * average_dur + else: + idx = i + + for jj in range(idx, idx + seglen): + imgbuf = sampledFrames[int(jj % len(sampledFrames))] + img = Image.fromarray(imgbuf, mode='RGB') + imgs.append(img) + + return imgs diff --git a/PaddleCV/video/application/video_tag/reader/reader_utils.py b/PaddleCV/video/application/video_tag/reader/reader_utils.py new file mode 100644 index 0000000000000000000000000000000000000000..b3741188e11350231600b50fb7fabad72340768c --- /dev/null +++ b/PaddleCV/video/application/video_tag/reader/reader_utils.py @@ -0,0 +1,81 @@ +# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserve. 
+# +#Licensed under the Apache License, Version 2.0 (the "License"); +#you may not use this file except in compliance with the License. +#You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +#Unless required by applicable law or agreed to in writing, software +#distributed under the License is distributed on an "AS IS" BASIS, +#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +#See the License for the specific language governing permissions and +#limitations under the License. + +import pickle +import cv2 +import numpy as np +import random + + +class ReaderNotFoundError(Exception): + "Error: reader not found" + + def __init__(self, reader_name, avail_readers): + super(ReaderNotFoundError, self).__init__() + self.reader_name = reader_name + self.avail_readers = avail_readers + + def __str__(self): + msg = "Reader {} Not Found.\nAvailable readers:\n".format( + self.reader_name) + for reader in self.avail_readers: + msg += " {}\n".format(reader) + return msg + + +class DataReader(object): + """data reader for video input""" + + def __init__(self, model_name, mode, cfg): + self.name = model_name + self.mode = mode + self.cfg = cfg + + def create_reader(self): + """Not implemented""" + pass + + def get_config_from_sec(self, sec, item, default=None): + if sec.upper() not in self.cfg: + return default + return self.cfg[sec.upper()].get(item, default) + + +class ReaderZoo(object): + def __init__(self): + self.reader_zoo = {} + + def regist(self, name, reader): + assert reader.__base__ == DataReader, "Unknown reader type {}".format( + type(reader)) + self.reader_zoo[name] = reader + + def get(self, name, mode, cfg): + for k, v in self.reader_zoo.items(): + if k == name: + return v(name, mode, cfg) + raise ReaderNotFoundError(name, self.reader_zoo.keys()) + + +# singleton reader_zoo +reader_zoo = ReaderZoo() + + +def regist_reader(name, reader): + reader_zoo.regist(name, reader) + + +def get_reader(name, mode, 
cfg): + reader_model = reader_zoo.get(name, mode, cfg) + return reader_model.create_reader() diff --git a/PaddleCV/video/application/video_tag/run_TSN_LSTM.sh b/PaddleCV/video/application/video_tag/run_TSN_LSTM.sh new file mode 100644 index 0000000000000000000000000000000000000000..8c2cf7087ec8543406e8404c74ff862069287222 --- /dev/null +++ b/PaddleCV/video/application/video_tag/run_TSN_LSTM.sh @@ -0,0 +1,4 @@ +export CUDA_VISIBLE_DEVICES=0 + +# TSN + AttentionLSTM +python videotag_main.py diff --git a/PaddleCV/video/application/video_tag/utils/__init__.py b/PaddleCV/video/application/video_tag/utils/__init__.py new file mode 100644 index 0000000000000000000000000000000000000000..e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 diff --git a/PaddleCV/video/application/video_tag/utils/config_utils.py b/PaddleCV/video/application/video_tag/utils/config_utils.py new file mode 100644 index 0000000000000000000000000000000000000000..6ceb42ee5ef3b535325fa26a7d140edd767ac0b7 --- /dev/null +++ b/PaddleCV/video/application/video_tag/utils/config_utils.py @@ -0,0 +1,75 @@ +# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserve. +# +#Licensed under the Apache License, Version 2.0 (the "License"); +#you may not use this file except in compliance with the License. +#You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +#Unless required by applicable law or agreed to in writing, software +#distributed under the License is distributed on an "AS IS" BASIS, +#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +#See the License for the specific language governing permissions and +#limitations under the License. 
+ +import yaml +from .utility import AttrDict +import logging +logger = logging.getLogger(__name__) + +CONFIG_SECS = [ + 'train', + 'valid', + 'test', + 'infer', +] + + +def parse_config(cfg_file): + """Load a config file into AttrDict""" + import yaml + with open(cfg_file, 'r') as fopen: + yaml_config = AttrDict(yaml.load(fopen, Loader=yaml.Loader)) + create_attr_dict(yaml_config) + return yaml_config + + +def create_attr_dict(yaml_config): + from ast import literal_eval + for key, value in yaml_config.items(): + if type(value) is dict: + yaml_config[key] = value = AttrDict(value) + if isinstance(value, str): + try: + value = literal_eval(value) + except BaseException: + pass + if isinstance(value, AttrDict): + create_attr_dict(yaml_config[key]) + else: + yaml_config[key] = value + return + + +def merge_configs(cfg, sec, args_dict): + assert sec in CONFIG_SECS, "invalid config section {}".format(sec) + sec_dict = getattr(cfg, sec.upper()) + for k, v in args_dict.items(): + if v is None: + continue + try: + if hasattr(sec_dict, k): + setattr(sec_dict, k, v) + except Exception: + pass + return cfg + + +def print_configs(cfg, mode): + logger.info("---------------- {:>5} Arguments ----------------".format( + mode)) + for sec, sec_items in cfg.items(): + logger.info("{}:".format(sec)) + for k, v in sec_items.items(): + logger.info(" {}:{}".format(k, v)) + logger.info("-------------------------------------------------") diff --git a/PaddleCV/video/application/video_tag/utils/utility.py b/PaddleCV/video/application/video_tag/utils/utility.py new file mode 100644 index 0000000000000000000000000000000000000000..fa94c0ddc4296100b206a4b4529774bd1c75c773 --- /dev/null +++ b/PaddleCV/video/application/video_tag/utils/utility.py @@ -0,0 +1,71 @@ +# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserve. +# +#Licensed under the Apache License, Version 2.0 (the "License"); +#you may not use this file except in compliance with the License. 
+#You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +#Unless required by applicable law or agreed to in writing, software +#distributed under the License is distributed on an "AS IS" BASIS, +#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +#See the License for the specific language governing permissions and +#limitations under the License. + +import os +import sys +import signal +import logging +import paddle +import paddle.fluid as fluid + +__all__ = ['AttrDict'] + +logger = logging.getLogger(__name__) + + +def _term(sig_num, addition): + print('current pid is %s, group id is %s' % (os.getpid(), os.getpgrp())) + os.killpg(os.getpgid(os.getpid()), signal.SIGKILL) + + +signal.signal(signal.SIGTERM, _term) +signal.signal(signal.SIGINT, _term) + + +class AttrDict(dict): + def __getattr__(self, key): + return self[key] + + def __setattr__(self, key, value): + if key in self.__dict__: + self.__dict__[key] = value + else: + self[key] = value + +def check_cuda(use_cuda, err = \ + "\nYou can not set use_gpu = True in the model because you are using paddlepaddle-cpu.\n \ + Please: 1. Install paddlepaddle-gpu to run your models on GPU or 2. Set use_gpu = False to run models on CPU.\n" + ): + try: + if use_cuda == True and fluid.is_compiled_with_cuda() == False: + print(err) + sys.exit(1) + except Exception as e: + pass + + +def check_version(): + """ + Log error and exit when the installed version of paddlepaddle is + not satisfied. + """ + err = "PaddlePaddle version 1.6 or higher is required, " \ + "or a suitable develop version is satisfied as well. \n" \ + "Please make sure the version is good with your code." 
\ + + try: + fluid.require_version('1.6.0') + except Exception as e: + logger.error(err) + sys.exit(1) diff --git a/PaddleCV/video/application/video_tag/video_tag.png b/PaddleCV/video/application/video_tag/video_tag.png new file mode 100644 index 0000000000000000000000000000000000000000..50ada247073a8aaaf3d4004547f8772ddb31fcb2 Binary files /dev/null and b/PaddleCV/video/application/video_tag/video_tag.png differ diff --git a/PaddleCV/video/application/video_tag/videotag_main.py b/PaddleCV/video/application/video_tag/videotag_main.py new file mode 100644 index 0000000000000000000000000000000000000000..e5cb5fa9cc3a29a000748dc6078a69663d246c07 --- /dev/null +++ b/PaddleCV/video/application/video_tag/videotag_main.py @@ -0,0 +1,236 @@ +# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserve. +# +#Licensed under the Apache License, Version 2.0 (the "License"); +#you may not use this file except in compliance with the License. +#You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +#Unless required by applicable law or agreed to in writing, software +#distributed under the License is distributed on an "AS IS" BASIS, +#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +#See the License for the specific language governing permissions and +#limitations under the License. 
+ +import os +import sys +import time +import logging +import argparse +import ast +import numpy as np +import paddle.fluid as fluid + +from utils.config_utils import * +import models +from reader import get_reader +from metrics import get_metrics +from utils.utility import check_cuda +from utils.utility import check_version + +logging.root.handlers = [] +FORMAT = '[%(levelname)s: %(filename)s: %(lineno)4d]: %(message)s' +logging.basicConfig(level=logging.INFO, format=FORMAT, stream=sys.stdout) +logger = logging.getLogger(__name__) + + +def parse_args(): + parser = argparse.ArgumentParser() + parser.add_argument( + '--extractor_config', + type=str, + default='configs/tsn.yaml', + help='path to config file of model') + parser.add_argument( + '--extractor_name', + type=str, + default='TSN', + help='extractor model name, default TSN') + parser.add_argument( + '--predictor_config', + '--pconfig', + type=str, + default='configs/attention_lstm.yaml', + help='path to config file of model') + parser.add_argument( + '--predictor_name', + '--pname', + type=str, + default='AttentionLSTM', + help='predictor model name, as AttentionLSTM, AttentionCluster, NEXTVLAD' + ) + parser.add_argument( + '--use_gpu', + type=ast.literal_eval, + default=True, + help='default use gpu.') + parser.add_argument( + '--extractor_weights', + type=str, + default='weights/tsn', + help='extractor weight path') + parser.add_argument( + '--predictor_weights', + '--pweights', + type=str, + default='weights/attention_lstm', + help='predictor weight path') + parser.add_argument( + '--filelist', + type=str, + default=None, + help='path of video data, multiple video') + parser.add_argument( + '--save_dir', type=str, default='data/results', help='output file path') + parser.add_argument( + '--label_file', + type=str, + default='label_3396.txt', + help='chinese label file path') + + args = parser.parse_args() + return args + + +def main(): + """ + Video classification model of 3000 Chinese tags. 
+ videotag_extractor_predictor (as videotag_TSN_AttentionLSTM) + two stages in our model: + 1. extract feature from input video (mp4 format) using extractor + 2. predict classification results from extracted feature using predictor + we implement this using two name scopes, i.e. extractor_scope and predictor_scope. + """ + + if not os.path.isdir(args.save_dir): + os.makedirs(args.save_dir) + extractor_config = parse_config(args.extractor_config) + extractor_infer_config = merge_configs(extractor_config, 'infer', + vars(args)) + extractor_start_time = time.time() + extractor_scope = fluid.Scope() + with fluid.scope_guard(extractor_scope): + extractor_startup_prog = fluid.Program() + extractor_main_prog = fluid.Program() + with fluid.program_guard(extractor_main_prog, extractor_startup_prog): + with fluid.unique_name.guard(): + # build model + extractor_model = models.get_model( + args.extractor_name, extractor_infer_config, mode='infer') + extractor_model.build_input(use_dataloader=False) + extractor_model.build_model() + extractor_feeds = extractor_model.feeds() + extractor_fetch_list = extractor_model.fetches() + + place = fluid.CUDAPlace(0) if args.use_gpu else fluid.CPUPlace() + exe = fluid.Executor(place) + + exe.run(extractor_startup_prog) + + logger.info('load extractor weights from {}'.format( + args.extractor_weights)) + extractor_model.load_test_weights(exe, args.extractor_weights, + extractor_main_prog) + + # get reader and metrics + extractor_reader = get_reader(args.extractor_name, 'infer', + extractor_infer_config) + extractor_feeder = fluid.DataFeeder( + place=place, feed_list=extractor_feeds) + + feature_list = [] + file_list = [] + for idx, data in enumerate(extractor_reader()): + file_id = [item[-1] for item in data] + feed_data = [item[:-1] for item in data] + feature_out = exe.run(fetch_list=extractor_fetch_list, + feed=extractor_feeder.feed(feed_data)) + feature_list.append(feature_out[0]) #get out from list + file_list.append(file_id) + 
logger.info( + '========[Stage 1 Sample {} ] Extractor finished======'. + format(idx)) + extractor_end_time = time.time() + print('extractor_time', extractor_end_time - extractor_start_time) + + predictor_config = parse_config(args.predictor_config) + predictor_infer_config = merge_configs(predictor_config, 'infer', + vars(args)) + + # get Predictor input from Extractor output + predictor_feed_list = [] + for i in range(len(feature_list)): + feature_out = feature_list[i] + if args.predictor_name == "AttentionCluster": + extractor_seg_num = extractor_infer_config.INFER.seg_num + predictor_seg_num = predictor_infer_config.MODEL.seg_num + idxs = [] + stride = float(extractor_seg_num) / predictor_seg_num + for j in range(predictor_seg_num): + pos = (j + np.random.random()) * stride + idxs.append(min(extractor_seg_num - 1, int(pos))) + extractor_feature = feature_out[:, idxs, :].astype( + float) # get from bs dim + else: + extractor_feature = feature_out.astype(float) + predictor_feed_data = [extractor_feature] + predictor_feed_list.append((predictor_feed_data, file_list[i])) + + predictor_start_time = time.time() + predictor_scope = fluid.Scope() + with fluid.scope_guard(predictor_scope): + predictor_startup_prog = fluid.Program() + predictor_main_prog = fluid.Program() + with fluid.program_guard(predictor_main_prog, predictor_startup_prog): + with fluid.unique_name.guard(): + # parse config + predictor_model = models.get_model( + args.predictor_name, predictor_infer_config, mode='infer') + predictor_model.build_input(use_dataloader=False) + predictor_model.build_model() + predictor_feeds = predictor_model.feeds() + + exe.run(predictor_startup_prog) + + logger.info('load predictor weights from {}'.format( + args.predictor_weights)) + predictor_model.load_test_weights(exe, args.predictor_weights, + predictor_main_prog) + + predictor_feeder = fluid.DataFeeder( + place=place, feed_list=predictor_feeds) + predictor_fetch_list = predictor_model.fetches() + predictor_metrics 
= get_metrics(args.predictor_name.upper(), + 'infer', predictor_infer_config) + predictor_metrics.reset() + + for idx, data in enumerate(predictor_feed_list): + file_id = data[1] + predictor_feed_data = data[0] + final_outs = exe.run( + fetch_list=predictor_fetch_list, + feed=predictor_feeder.feed(predictor_feed_data)) + logger.info( + '=======[Stage 2 Sample {} ] Predictor finished========'. + format(idx)) + final_result_list = [item + for item in final_outs] + [file_id] + + predictor_metrics.accumulate(final_result_list) + predictor_metrics.finalize_and_log_out( + savedir=args.save_dir, label_file=args.label_file) + predictor_end_time = time.time() + print('predictor_time', predictor_end_time - predictor_start_time) + + +if __name__ == '__main__': + start_time = time.time() + args = parse_args() + print(args) + check_cuda(args.use_gpu) + check_version() + logger.info(args) + main() + end_time = time.time() + period = end_time - start_time + print('[INFER] infer finished. cost time: {}'.format(period)) diff --git a/dygraph/mnist/train.py b/dygraph/mnist/train.py index f81df8f26458c93c1f658a9bc783d14a3c5b8256..58db6f1d728090cc63b0b802e7f765c37c5036aa 100644 --- a/dygraph/mnist/train.py +++ b/dygraph/mnist/train.py @@ -99,11 +99,13 @@ class MNIST(fluid.dygraph.Layer): self.pool_2_shape = 50 * 4 * 4 SIZE = 10 scale = (2.0 / (self.pool_2_shape**2 * SIZE))**0.5 - self._fc = Linear(self.pool_2_shape, 10, - param_attr=fluid.param_attr.ParamAttr( - initializer=fluid.initializer.NormalInitializer( - loc=0.0, scale=scale)), - act="softmax") + self._fc = Linear( + self.pool_2_shape, + 10, + param_attr=fluid.param_attr.ParamAttr( + initializer=fluid.initializer.NormalInitializer( + loc=0.0, scale=scale)), + act="softmax") def forward(self, inputs, label=None): x = self._simple_img_conv_pool_1(inputs) @@ -117,17 +119,21 @@ class MNIST(fluid.dygraph.Layer): return x +def reader_decorator(reader): + def __reader__(): + for item in reader(): + img = 
np.array(item[0]).astype('float32').reshape(1, 28, 28) + label = np.array(item[1]).astype('int64').reshape(1) + yield img, label + + return __reader__ + + def test_mnist(reader, model, batch_size): acc_set = [] avg_loss_set = [] for batch_id, data in enumerate(reader()): - dy_x_data = np.array([x[0].reshape(1, 28, 28) - for x in data]).astype('float32') - y_data = np.array( - [x[1] for x in data]).astype('int64').reshape(batch_size, 1) - - img = to_variable(dy_x_data) - label = to_variable(y_data) + img, label = data label.stop_gradient = True prediction, acc = model(img, label) loss = fluid.layers.cross_entropy(input=prediction, label=label) @@ -187,28 +193,33 @@ def train_mnist(args): if args.use_data_parallel: strategy = fluid.dygraph.parallel.prepare_context() mnist = MNIST() - adam = AdamOptimizer(learning_rate=0.001, parameter_list=mnist.parameters()) + adam = AdamOptimizer( + learning_rate=0.001, parameter_list=mnist.parameters()) if args.use_data_parallel: mnist = fluid.dygraph.parallel.DataParallel(mnist, strategy) train_reader = paddle.batch( - paddle.dataset.mnist.train(), batch_size=BATCH_SIZE, drop_last=True) + reader_decorator(paddle.dataset.mnist.train()), + batch_size=BATCH_SIZE, + drop_last=True) if args.use_data_parallel: train_reader = fluid.contrib.reader.distributed_batch_reader( train_reader) test_reader = paddle.batch( - paddle.dataset.mnist.test(), batch_size=BATCH_SIZE, drop_last=True) + reader_decorator(paddle.dataset.mnist.test()), + batch_size=BATCH_SIZE, + drop_last=True) + + train_loader = fluid.io.DataLoader.from_generator(capacity=10) + train_loader.set_sample_list_generator(train_reader, places=place) + + test_loader = fluid.io.DataLoader.from_generator(capacity=10) + test_loader.set_sample_list_generator(test_reader, places=place) for epoch in range(epoch_num): - for batch_id, data in enumerate(train_reader()): - dy_x_data = np.array([x[0].reshape(1, 28, 28) - for x in data]).astype('float32') - y_data = np.array( - [x[1] for x in 
data]).astype('int64').reshape(-1, 1) - - img = to_variable(dy_x_data) - label = to_variable(y_data) + for batch_id, data in enumerate(train_loader()): + img, label = data label.stop_gradient = True cost, acc = mnist(img, label) @@ -231,7 +242,7 @@ def train_mnist(args): epoch, batch_id, avg_loss.numpy())) mnist.eval() - test_cost, test_acc = test_mnist(test_reader, mnist, BATCH_SIZE) + test_cost, test_acc = test_mnist(test_loader, mnist, BATCH_SIZE) mnist.train() if args.ce: print("kpis\ttest_acc\t%s" % test_acc) @@ -244,7 +255,7 @@ def train_mnist(args): fluid.dygraph.parallel.Env().local_rank == 0) if save_parameters: fluid.save_dygraph(mnist.state_dict(), "save_temp") - + print("checkpoint saved") inference_mnist() diff --git a/dygraph/mobilenet/reader.py b/dygraph/mobilenet/reader.py index bba33c355ba02983c5d9d54b3bc5f2535d53cfb1..e598d19a3b44fdfcea31abd3f909c5639ba22d45 100644 --- a/dygraph/mobilenet/reader.py +++ b/dygraph/mobilenet/reader.py @@ -239,7 +239,7 @@ def process_image(sample, settings, mode, color_jitter, rotate): img /= img_std if mode == 'train' or mode == 'val': - return (img, sample[1]) + return (img, [sample[1]]) elif mode == 'test': return (img, ) diff --git a/dygraph/mobilenet/train.py b/dygraph/mobilenet/train.py index 16e27dc4fbc22675e2446dbc5ff146e1b6b5b909..547e9d45506b7cec9f84e7543d8a60fea2fadc9c 100644 --- a/dygraph/mobilenet/train.py +++ b/dygraph/mobilenet/train.py @@ -116,10 +116,8 @@ def train_mobilenet(): optimizer.set_dict(opti_dict) # 3. 
reader - train_data_loader, train_data = utility.create_data_loader( - is_train=True, args=args) - test_data_loader, test_data = utility.create_data_loader( - is_train=False, args=args) + train_data_loader = utility.create_data_loader(is_train=True, args=args) + test_data_loader = utility.create_data_loader(is_train=False, args=args) num_trainers = int(os.environ.get('PADDLE_TRAINERS_NUM', 1)) imagenet_reader = reader.ImageNetReader(seed=0, place_num=place_num) train_reader = imagenet_reader.train(settings=args) @@ -145,8 +143,6 @@ def train_mobilenet(): t1 = time.time() if args.max_iter and total_batch_num == args.max_iter: return - label = to_variable(label.numpy().astype('int64').reshape( - int(args.batch_size // place_num), 1)) t_start = time.time() # 4.1.1 call net() diff --git a/dygraph/mobilenet/utils/utility.py b/dygraph/mobilenet/utils/utility.py index a7bc9c883edba2e6115d3fe96a61e569b5d7407a..22314941adb4f5ee2399147562310054a3392448 100644 --- a/dygraph/mobilenet/utils/utility.py +++ b/dygraph/mobilenet/utils/utility.py @@ -309,32 +309,14 @@ def create_data_loader(is_train, args): Returns: data_loader and the input data of net, """ - image_shape = [int(m) for m in args.image_shape.split(",")] - - feed_image = fluid.data( - name="feed_image", - shape=[None] + image_shape, - dtype="float32", - lod_level=0) - - feed_label = fluid.data( - name="feed_label", shape=[None, 1], dtype="int64", lod_level=0) - feed_y_a = fluid.data( - name="feed_y_a", shape=[None, 1], dtype="int64", lod_level=0) - if is_train and args.use_mixup: - feed_y_b = fluid.data( - name="feed_y_b", shape=[None, 1], dtype="int64", lod_level=0) - feed_lam = fluid.data( - name="feed_lam", shape=[None, 1], dtype="float32", lod_level=0) - data_loader = fluid.io.DataLoader.from_generator( capacity=64, use_double_buffer=True, iterable=True, return_list=True) - return data_loader, [feed_image, feed_y_a, feed_y_b, feed_lam] + return data_loader else: data_loader = fluid.io.DataLoader.from_generator( 
capacity=64, @@ -342,7 +324,7 @@ def create_data_loader(is_train, args): iterable=True, return_list=True) - return data_loader, [feed_image, feed_label] + return data_loader def print_info(pass_id, batch_id, print_step, metrics, time_info, info_mode): diff --git a/dygraph/ptb_lm/ptb_dy.py b/dygraph/ptb_lm/ptb_dy.py index d33e64194c33c5a4c7ddedbda405daa58fe330ae..0a8ed9494e16937ac1fac4068e37b2a6415212bb 100644 --- a/dygraph/ptb_lm/ptb_dy.py +++ b/dygraph/ptb_lm/ptb_dy.py @@ -1,461 +1,474 @@ -# Copyright (c) 2018 PaddlePaddle Authors. All Rights Reserved. -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. 
- -from __future__ import print_function - -import os -import unittest -import paddle.fluid as fluid -import paddle.fluid.core as core -from paddle.fluid.dygraph.nn import Embedding -import paddle.fluid.framework as framework -from paddle.fluid.optimizer import SGDOptimizer -from paddle.fluid.dygraph.base import to_variable -import numpy as np -import six -import multiprocessing - -import reader -import model_check -import time - -from args import * - -#import fluid.clip as clip -#from fluid.clip import * - -import sys -if sys.version[0] == '2': - reload(sys) - sys.setdefaultencoding("utf-8") - - -class SimpleLSTMRNN(fluid.Layer): - def __init__(self, - hidden_size, - num_steps, - num_layers=2, - init_scale=0.1, - dropout=None): - super(SimpleLSTMRNN, self).__init__() - self._hidden_size = hidden_size - self._num_layers = num_layers - self._init_scale = init_scale - self._dropout = dropout - self._num_steps = num_steps - self.cell_array = [] - self.hidden_array = [] - - self.weight_1_arr = [] - self.weight_2_arr = [] - self.bias_arr = [] - self.mask_array = [] - - for i in range(self._num_layers): - weight_1 = self.create_parameter( - attr=fluid.ParamAttr( - initializer=fluid.initializer.UniformInitializer( - low=-self._init_scale, high=self._init_scale)), - shape=[self._hidden_size * 2, self._hidden_size * 4], - dtype="float32", - default_initializer=fluid.initializer.UniformInitializer( - low=-self._init_scale, high=self._init_scale)) - self.weight_1_arr.append(self.add_parameter('w_%d' % i, weight_1)) - bias_1 = self.create_parameter( - attr=fluid.ParamAttr( - initializer=fluid.initializer.UniformInitializer( - low=-self._init_scale, high=self._init_scale)), - shape=[self._hidden_size * 4], - dtype="float32", - default_initializer=fluid.initializer.Constant(0.0)) - self.bias_arr.append(self.add_parameter('b_%d' % i, bias_1)) - - def forward(self, input_embedding, init_hidden=None, init_cell=None): - cell_array = [] - hidden_array = [] - - for i in 
range(self._num_layers): - hidden_array.append(init_hidden[i]) - cell_array.append(init_cell[i]) - - res = [] - for index in range(self._num_steps): - step_input = input_embedding[:,index,:] - for k in range(self._num_layers): - pre_hidden = hidden_array[k] - pre_cell = cell_array[k] - weight_1 = self.weight_1_arr[k] - bias = self.bias_arr[k] - - nn = fluid.layers.concat([step_input, pre_hidden], 1) - gate_input = fluid.layers.matmul(x=nn, y=weight_1) - - gate_input = fluid.layers.elementwise_add(gate_input, bias) - i, j, f, o = fluid.layers.split( - gate_input, num_or_sections=4, dim=-1) - c = pre_cell * fluid.layers.sigmoid(f) + fluid.layers.sigmoid( - i) * fluid.layers.tanh(j) - m = fluid.layers.tanh(c) * fluid.layers.sigmoid(o) - hidden_array[k] = m - cell_array[k] = c - step_input = m - - if self._dropout is not None and self._dropout > 0.0: - step_input = fluid.layers.dropout( - step_input, - dropout_prob=self._dropout, - dropout_implementation='upscale_in_train') - res.append(step_input) - real_res = fluid.layers.concat(res, 1) - real_res = fluid.layers.reshape(real_res, [ -1, self._num_steps, self._hidden_size]) - last_hidden = fluid.layers.concat(hidden_array, 1) - last_hidden = fluid.layers.reshape( - last_hidden, shape=[-1, self._num_layers, self._hidden_size]) - last_hidden = fluid.layers.transpose(x=last_hidden, perm=[1, 0, 2]) - last_cell = fluid.layers.concat(cell_array, 1) - last_cell = fluid.layers.reshape( - last_cell, shape=[-1, self._num_layers, self._hidden_size]) - last_cell = fluid.layers.transpose(x=last_cell, perm=[1, 0, 2]) - return real_res, last_hidden, last_cell - - -class PtbModel(fluid.Layer): - def __init__(self, - hidden_size, - vocab_size, - num_layers=2, - num_steps=20, - init_scale=0.1, - dropout=None): - super(PtbModel, self).__init__() - self.hidden_size = hidden_size - self.vocab_size = vocab_size - self.init_scale = init_scale - self.num_layers = num_layers - self.num_steps = num_steps - self.dropout = dropout - 
self.simple_lstm_rnn = SimpleLSTMRNN( - hidden_size, - num_steps, - num_layers=num_layers, - init_scale=init_scale, - dropout=dropout) - self.embedding = Embedding( - size=[vocab_size, hidden_size], - dtype='float32', - is_sparse=False, - param_attr=fluid.ParamAttr( - name='embedding_para', - initializer=fluid.initializer.UniformInitializer( - low=-init_scale, high=init_scale))) - self.softmax_weight = self.create_parameter( - attr=fluid.ParamAttr(), - shape=[self.hidden_size, self.vocab_size], - dtype="float32", - default_initializer=fluid.initializer.UniformInitializer( - low=-self.init_scale, high=self.init_scale)) - self.softmax_bias = self.create_parameter( - attr=fluid.ParamAttr(), - shape=[self.vocab_size], - dtype="float32", - default_initializer=fluid.initializer.UniformInitializer( - low=-self.init_scale, high=self.init_scale)) - - def build_once(self, input, label, init_hidden, init_cell): - pass - - def forward(self, input, label, init_hidden, init_cell): - - init_h = fluid.layers.reshape( - init_hidden, shape=[self.num_layers, -1, self.hidden_size]) - - init_c = fluid.layers.reshape( - init_cell, shape=[self.num_layers, -1, self.hidden_size]) - - x_emb = self.embedding(input) - - x_emb = fluid.layers.reshape( - x_emb, shape=[-1, self.num_steps, self.hidden_size]) - if self.dropout is not None and self.dropout > 0.0: - x_emb = fluid.layers.dropout( - x_emb, - dropout_prob=self.dropout, - dropout_implementation='upscale_in_train') - rnn_out, last_hidden, last_cell = self.simple_lstm_rnn(x_emb, init_h, - init_c) - - projection = fluid.layers.matmul(rnn_out, self.softmax_weight) - projection = fluid.layers.elementwise_add(projection, self.softmax_bias) - - loss = fluid.layers.softmax_with_cross_entropy( - logits=projection, label=label, soft_label=False) - loss = fluid.layers.reshape(loss, shape=[-1, self.num_steps]) - loss = fluid.layers.reduce_mean(loss, dim=[0]) - loss = fluid.layers.reduce_sum(loss) - - return loss, last_hidden, last_cell - - def 
debug_emb(self): - - np.save("emb_grad", self.x_emb.gradient()) - - -def train_ptb_lm(): - args = parse_args() - - # check if set use_gpu=True in paddlepaddle cpu version - model_check.check_cuda(args.use_gpu) - - place = core.CPUPlace() - if args.use_gpu: - place = fluid.CUDAPlace(0) - dev_count = fluid.core.get_cuda_device_count() - else: - place = fluid.CPUPlace() - dev_count = int(os.environ.get('CPU_NUM', multiprocessing.cpu_count())) - - # check if paddlepaddle version is satisfied - model_check.check_version() - - model_type = args.model_type - - vocab_size = 10000 - if model_type == "test": - num_layers = 1 - batch_size = 2 - hidden_size = 10 - num_steps = 3 - init_scale = 0.1 - max_grad_norm = 5.0 - epoch_start_decay = 1 - max_epoch = 1 - dropout = 0.0 - lr_decay = 0.5 - base_learning_rate = 1.0 - elif model_type == "small": - num_layers = 2 - batch_size = 20 - hidden_size = 200 - num_steps = 20 - init_scale = 0.1 - max_grad_norm = 5.0 - epoch_start_decay = 4 - max_epoch = 13 - dropout = 0.0 - lr_decay = 0.5 - base_learning_rate = 1.0 - elif model_type == "medium": - num_layers = 2 - batch_size = 20 - hidden_size = 650 - num_steps = 35 - init_scale = 0.05 - max_grad_norm = 5.0 - epoch_start_decay = 6 - max_epoch = 39 - dropout = 0.5 - lr_decay = 0.8 - base_learning_rate = 1.0 - elif model_type == "large": - num_layers = 2 - batch_size = 20 - hidden_size = 1500 - num_steps = 35 - init_scale = 0.04 - max_grad_norm = 10.0 - epoch_start_decay = 14 - max_epoch = 55 - dropout = 0.65 - lr_decay = 1.0 / 1.15 - base_learning_rate = 1.0 - else: - print("model type not support") - return - - with fluid.dygraph.guard(place): - if args.ce: - print("ce mode") - seed = 33 - np.random.seed(seed) - fluid.default_startup_program().random_seed = seed - fluid.default_main_program().random_seed = seed - max_epoch = 1 - ptb_model = PtbModel( - hidden_size=hidden_size, - vocab_size=vocab_size, - num_layers=num_layers, - num_steps=num_steps, - init_scale=init_scale, - 
dropout=dropout) - - if args.init_from_pretrain_model: - if not os.path.exists(args.init_from_pretrain_model + '.pdparams'): - print(args.init_from_pretrain_model) - raise Warning("The pretrained params do not exist.") - return - fluid.load_dygraph(args.init_from_pretrain_model) - print("finish initing model from pretrained params from %s" % - (args.init_from_pretrain_model)) - - dy_param_updated = dict() - dy_param_init = dict() - dy_loss = None - last_hidden = None - last_cell = None - - data_path = args.data_path - print("begin to load data") - ptb_data = reader.get_ptb_data(data_path) - print("finished load data") - train_data, valid_data, test_data = ptb_data - - batch_len = len(train_data) // batch_size - total_batch_size = (batch_len - 1) // num_steps - log_interval = 200 - - bd = [] - lr_arr = [1.0] - for i in range(1, max_epoch): - bd.append(total_batch_size * i) - new_lr = base_learning_rate * (lr_decay** - max(i + 1 - epoch_start_decay, 0.0)) - lr_arr.append(new_lr) - - grad_clip = fluid.clip.GradientClipByGlobalNorm(max_grad_norm) - sgd = SGDOptimizer( - learning_rate=fluid.layers.piecewise_decay(boundaries=bd, values=lr_arr), - parameter_list=ptb_model.parameters(), - grad_clip=grad_clip) - - def eval(model, data): - print("begin to eval") - total_loss = 0.0 - iters = 0.0 - init_hidden_data = np.zeros( - (num_layers, batch_size, hidden_size), dtype='float32') - init_cell_data = np.zeros( - (num_layers, batch_size, hidden_size), dtype='float32') - - model.eval() - train_data_iter = reader.get_data_iter(data, batch_size, num_steps) - for batch_id, batch in enumerate(train_data_iter): - x_data, y_data = batch - x_data = x_data.reshape((-1, num_steps, 1)) - y_data = y_data.reshape((-1, num_steps, 1)) - x = to_variable(x_data) - y = to_variable(y_data) - init_hidden = to_variable(init_hidden_data) - init_cell = to_variable(init_cell_data) - dy_loss, last_hidden, last_cell = ptb_model(x, y, init_hidden, - init_cell) - - out_loss = dy_loss.numpy() - - 
init_hidden_data = last_hidden.numpy() - init_cell_data = last_cell.numpy() - - total_loss += out_loss - iters += num_steps - - print("eval finished") - ppl = np.exp(total_loss / iters) - print("ppl ", batch_id, ppl[0]) - - ce_time = [] - ce_ppl = [] - - total_batch_num = 0 #this is for benchmark - for epoch_id in range(max_epoch): - ptb_model.train() - total_loss = 0.0 - iters = 0.0 - init_hidden_data = np.zeros( - (num_layers, batch_size, hidden_size), dtype='float32') - init_cell_data = np.zeros( - (num_layers, batch_size, hidden_size), dtype='float32') - - train_data_iter = reader.get_data_iter(train_data, batch_size, - num_steps) - init_hidden = to_variable(init_hidden_data) - init_cell = to_variable(init_cell_data) - start_time = time.time() - for batch_id, batch in enumerate(train_data_iter): - if args.max_iter and total_batch_num == args.max_iter: - return - batch_start = time.time() - x_data, y_data = batch - - x_data = x_data.reshape((-1, num_steps, 1)) - y_data = y_data.reshape((-1, num_steps, 1)) - - x = to_variable(x_data) - y = to_variable(y_data) - - dy_loss, last_hidden, last_cell = ptb_model(x, y, init_hidden, - init_cell) - init_hidden = last_hidden.detach() - init_cell = last_cell.detach() - out_loss = dy_loss.numpy() - - dy_loss.backward() - sgd.minimize(dy_loss) - - ptb_model.clear_gradients() - total_loss += out_loss - batch_end = time.time() - train_batch_cost = batch_end - batch_start - iters += num_steps - total_batch_num = total_batch_num + 1 #this is for benchmark - - if batch_id > 0 and batch_id % log_interval == 0: - ppl = np.exp(total_loss / iters) - print("-- Epoch:[%d]; Batch:[%d]; ppl: %.5f, lr: %.5f, loss: %.5f, batch cost: %.5f" % - (epoch_id, batch_id, ppl[0], - sgd._global_learning_rate().numpy(), out_loss, train_batch_cost)) - - print("one epoch finished", epoch_id) - print("time cost ", time.time() - start_time) - ppl = np.exp(total_loss / iters) - ce_time.append(time.time() - start_time) - ce_ppl.append(ppl[0]) - print("-- 
Epoch:[%d]; ppl: %.5f" % (epoch_id, ppl[0])) - - if batch_size <= 20 and epoch_id == 0 and ppl[0] > 1000: - # for bad init, after first epoch, the loss is over 1000 - # no more need to continue - print("Parameters are randomly initialized and not good this time because the loss is over 1000 after the first epoch.") - print("Abort this training process and please start again.") - return - - save_model_dir = os.path.join(args.save_model_dir, - str(epoch_id), 'params') - fluid.save_dygraph(ptb_model.state_dict(), save_model_dir) - print("Saved model to: %s.\n" % save_model_dir) - - eval(ptb_model, valid_data) - - if args.ce: - _ppl = 0 - _time = 0 - try: - _time = ce_time[-1] - _ppl = ce_ppl[-1] - except: - print("ce info error") - print("kpis\ttrain_duration_card%s\t%s" % (dev_count, _time)) - print("kpis\ttrain_ppl_card%s\t%f" % (dev_count, _ppl)) - - eval(ptb_model, test_data) - -train_ptb_lm() +# Copyright (c) 2018 PaddlePaddle Authors. All Rights Reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +from __future__ import print_function + +import os +import unittest +import paddle.fluid as fluid +import paddle.fluid.core as core +from paddle.fluid.dygraph.nn import Embedding +import paddle.fluid.framework as framework +from paddle.fluid.optimizer import SGDOptimizer +from paddle.fluid.dygraph.base import to_variable +import numpy as np +import six +import multiprocessing + +import reader +import model_check +import time + +from args import * + +#import fluid.clip as clip +#from fluid.clip import * + +import sys +if sys.version[0] == '2': + reload(sys) + sys.setdefaultencoding("utf-8") + + +class SimpleLSTMRNN(fluid.Layer): + def __init__(self, + hidden_size, + num_steps, + num_layers=2, + init_scale=0.1, + dropout=None): + super(SimpleLSTMRNN, self).__init__() + self._hidden_size = hidden_size + self._num_layers = num_layers + self._init_scale = init_scale + self._dropout = dropout + self._num_steps = num_steps + self.cell_array = [] + self.hidden_array = [] + + self.weight_1_arr = [] + self.weight_2_arr = [] + self.bias_arr = [] + self.mask_array = [] + + for i in range(self._num_layers): + weight_1 = self.create_parameter( + attr=fluid.ParamAttr( + initializer=fluid.initializer.UniformInitializer( + low=-self._init_scale, high=self._init_scale)), + shape=[self._hidden_size * 2, self._hidden_size * 4], + dtype="float32", + default_initializer=fluid.initializer.UniformInitializer( + low=-self._init_scale, high=self._init_scale)) + self.weight_1_arr.append(self.add_parameter('w_%d' % i, weight_1)) + bias_1 = self.create_parameter( + attr=fluid.ParamAttr( + initializer=fluid.initializer.UniformInitializer( + low=-self._init_scale, high=self._init_scale)), + shape=[self._hidden_size * 4], + dtype="float32", + default_initializer=fluid.initializer.Constant(0.0)) + self.bias_arr.append(self.add_parameter('b_%d' % i, bias_1)) + + def forward(self, input_embedding, init_hidden=None, init_cell=None): + cell_array = [] + hidden_array = [] + + for i in 
range(self._num_layers): + hidden_array.append(init_hidden[i]) + cell_array.append(init_cell[i]) + + res = [] + for index in range(self._num_steps): + step_input = input_embedding[:, index, :] + for k in range(self._num_layers): + pre_hidden = hidden_array[k] + pre_cell = cell_array[k] + weight_1 = self.weight_1_arr[k] + bias = self.bias_arr[k] + + nn = fluid.layers.concat([step_input, pre_hidden], 1) + gate_input = fluid.layers.matmul(x=nn, y=weight_1) + + gate_input = fluid.layers.elementwise_add(gate_input, bias) + i, j, f, o = fluid.layers.split( + gate_input, num_or_sections=4, dim=-1) + c = pre_cell * fluid.layers.sigmoid(f) + fluid.layers.sigmoid( + i) * fluid.layers.tanh(j) + m = fluid.layers.tanh(c) * fluid.layers.sigmoid(o) + hidden_array[k] = m + cell_array[k] = c + step_input = m + + if self._dropout is not None and self._dropout > 0.0: + step_input = fluid.layers.dropout( + step_input, + dropout_prob=self._dropout, + dropout_implementation='upscale_in_train') + res.append(step_input) + real_res = fluid.layers.concat(res, 1) + real_res = fluid.layers.reshape( + real_res, [-1, self._num_steps, self._hidden_size]) + last_hidden = fluid.layers.concat(hidden_array, 1) + last_hidden = fluid.layers.reshape( + last_hidden, shape=[-1, self._num_layers, self._hidden_size]) + last_hidden = fluid.layers.transpose(x=last_hidden, perm=[1, 0, 2]) + last_cell = fluid.layers.concat(cell_array, 1) + last_cell = fluid.layers.reshape( + last_cell, shape=[-1, self._num_layers, self._hidden_size]) + last_cell = fluid.layers.transpose(x=last_cell, perm=[1, 0, 2]) + return real_res, last_hidden, last_cell + + +class PtbModel(fluid.Layer): + def __init__(self, + hidden_size, + vocab_size, + num_layers=2, + num_steps=20, + init_scale=0.1, + dropout=None): + super(PtbModel, self).__init__() + self.hidden_size = hidden_size + self.vocab_size = vocab_size + self.init_scale = init_scale + self.num_layers = num_layers + self.num_steps = num_steps + self.dropout = dropout + 
+        self.simple_lstm_rnn = SimpleLSTMRNN(
+            hidden_size,
+            num_steps,
+            num_layers=num_layers,
+            init_scale=init_scale,
+            dropout=dropout)
+        self.embedding = Embedding(
+            size=[vocab_size, hidden_size],
+            dtype='float32',
+            is_sparse=False,
+            param_attr=fluid.ParamAttr(
+                name='embedding_para',
+                initializer=fluid.initializer.UniformInitializer(
+                    low=-init_scale, high=init_scale)))
+        self.softmax_weight = self.create_parameter(
+            attr=fluid.ParamAttr(),
+            shape=[self.hidden_size, self.vocab_size],
+            dtype="float32",
+            default_initializer=fluid.initializer.UniformInitializer(
+                low=-self.init_scale, high=self.init_scale))
+        self.softmax_bias = self.create_parameter(
+            attr=fluid.ParamAttr(),
+            shape=[self.vocab_size],
+            dtype="float32",
+            default_initializer=fluid.initializer.UniformInitializer(
+                low=-self.init_scale, high=self.init_scale))
+
+    def build_once(self, input, label, init_hidden, init_cell):
+        pass
+
+    def forward(self, input, label, init_hidden, init_cell):
+
+        init_h = fluid.layers.reshape(
+            init_hidden, shape=[self.num_layers, -1, self.hidden_size])
+
+        init_c = fluid.layers.reshape(
+            init_cell, shape=[self.num_layers, -1, self.hidden_size])
+
+        x_emb = self.embedding(input)
+
+        x_emb = fluid.layers.reshape(
+            x_emb, shape=[-1, self.num_steps, self.hidden_size])
+        if self.dropout is not None and self.dropout > 0.0:
+            x_emb = fluid.layers.dropout(
+                x_emb,
+                dropout_prob=self.dropout,
+                dropout_implementation='upscale_in_train')
+        rnn_out, last_hidden, last_cell = self.simple_lstm_rnn(x_emb, init_h,
+                                                               init_c)
+
+        projection = fluid.layers.matmul(rnn_out, self.softmax_weight)
+        projection = fluid.layers.elementwise_add(projection, self.softmax_bias)
+
+        loss = fluid.layers.softmax_with_cross_entropy(
+            logits=projection, label=label, soft_label=False)
+        loss = fluid.layers.reshape(loss, shape=[-1, self.num_steps])
+        loss = fluid.layers.reduce_mean(loss, dim=[0])
+        loss = fluid.layers.reduce_sum(loss)
+
+        return loss, last_hidden, last_cell
+
+    def debug_emb(self):
+
+        np.save("emb_grad", self.x_emb.gradient())
+
+
+def train_ptb_lm():
+    args = parse_args()
+
+    # check if set use_gpu=True in paddlepaddle cpu version
+    model_check.check_cuda(args.use_gpu)
+
+    place = core.CPUPlace()
+    if args.use_gpu:
+        place = fluid.CUDAPlace(0)
+        dev_count = fluid.core.get_cuda_device_count()
+    else:
+        place = fluid.CPUPlace()
+        dev_count = int(os.environ.get('CPU_NUM', multiprocessing.cpu_count()))
+
+    # check if paddlepaddle version is satisfied
+    model_check.check_version()
+
+    model_type = args.model_type
+
+    vocab_size = 10000
+    if model_type == "test":
+        num_layers = 1
+        batch_size = 2
+        hidden_size = 10
+        num_steps = 3
+        init_scale = 0.1
+        max_grad_norm = 5.0
+        epoch_start_decay = 1
+        max_epoch = 1
+        dropout = 0.0
+        lr_decay = 0.5
+        base_learning_rate = 1.0
+    elif model_type == "small":
+        num_layers = 2
+        batch_size = 20
+        hidden_size = 200
+        num_steps = 20
+        init_scale = 0.1
+        max_grad_norm = 5.0
+        epoch_start_decay = 4
+        max_epoch = 13
+        dropout = 0.0
+        lr_decay = 0.5
+        base_learning_rate = 1.0
+    elif model_type == "medium":
+        num_layers = 2
+        batch_size = 20
+        hidden_size = 650
+        num_steps = 35
+        init_scale = 0.05
+        max_grad_norm = 5.0
+        epoch_start_decay = 6
+        max_epoch = 39
+        dropout = 0.5
+        lr_decay = 0.8
+        base_learning_rate = 1.0
+    elif model_type == "large":
+        num_layers = 2
+        batch_size = 20
+        hidden_size = 1500
+        num_steps = 35
+        init_scale = 0.04
+        max_grad_norm = 10.0
+        epoch_start_decay = 14
+        max_epoch = 55
+        dropout = 0.65
+        lr_decay = 1.0 / 1.15
+        base_learning_rate = 1.0
+    else:
+        print("model type not support")
+        return
+
+    with fluid.dygraph.guard(place):
+        if args.ce:
+            print("ce mode")
+            seed = 33
+            np.random.seed(seed)
+            fluid.default_startup_program().random_seed = seed
+            fluid.default_main_program().random_seed = seed
+            max_epoch = 1
+        ptb_model = PtbModel(
+            hidden_size=hidden_size,
+            vocab_size=vocab_size,
+            num_layers=num_layers,
+            num_steps=num_steps,
+            init_scale=init_scale,
+            dropout=dropout)
+
+        if args.init_from_pretrain_model:
+            if not os.path.exists(args.init_from_pretrain_model + '.pdparams'):
+                print(args.init_from_pretrain_model)
+                raise Warning("The pretrained params do not exist.")
+                return
+            fluid.load_dygraph(args.init_from_pretrain_model)
+            print("finish initing model from pretrained params from %s" %
+                  (args.init_from_pretrain_model))
+
+        dy_param_updated = dict()
+        dy_param_init = dict()
+        dy_loss = None
+        last_hidden = None
+        last_cell = None
+
+        data_path = args.data_path
+        print("begin to load data")
+        ptb_data = reader.get_ptb_data(data_path)
+        print("finished load data")
+        train_data, valid_data, test_data = ptb_data
+
+        batch_len = len(train_data) // batch_size
+        total_batch_size = (batch_len - 1) // num_steps
+        log_interval = 200
+
+        bd = []
+        lr_arr = [1.0]
+        for i in range(1, max_epoch):
+            bd.append(total_batch_size * i)
+            new_lr = base_learning_rate * (lr_decay**
+                                           max(i + 1 - epoch_start_decay, 0.0))
+            lr_arr.append(new_lr)
+
+        grad_clip = fluid.clip.GradientClipByGlobalNorm(max_grad_norm)
+        sgd = SGDOptimizer(
+            learning_rate=fluid.layers.piecewise_decay(
+                boundaries=bd, values=lr_arr),
+            parameter_list=ptb_model.parameters(),
+            grad_clip=grad_clip)
+
+        def reader_decorator(reader):
+            def __reader__():
+                for item in reader:
+                    x_data = item[0].reshape((-1, num_steps, 1))
+                    y_data = item[1].reshape((-1, num_steps, 1))
+                    yield x_data, y_data
+
+            return __reader__
+
+        def eval(model, data):
+            print("begin to eval")
+            total_loss = 0.0
+            iters = 0.0
+            init_hidden_data = np.zeros(
+                (num_layers, batch_size, hidden_size), dtype='float32')
+            init_cell_data = np.zeros(
+                (num_layers, batch_size, hidden_size), dtype='float32')
+
+            model.eval()
+            train_data_iter = reader_decorator(
+                reader.get_data_iter(data, batch_size, num_steps))
+
+            eval_data_loader = fluid.io.DataLoader.from_generator(capacity=200)
+            eval_data_loader.set_batch_generator(train_data_iter, places=place)
+
+            for batch_id, batch in enumerate(eval_data_loader):
+                x, y = batch
+                init_hidden = to_variable(init_hidden_data)
+                init_cell = to_variable(init_cell_data)
+                dy_loss, last_hidden, last_cell = ptb_model(x, y, init_hidden,
+                                                            init_cell)
+
+                out_loss = dy_loss.numpy()
+
+                init_hidden_data = last_hidden.numpy()
+                init_cell_data = last_cell.numpy()
+
+                total_loss += out_loss
+                iters += num_steps
+
+            print("eval finished")
+            ppl = np.exp(total_loss / iters)
+            print("ppl ", batch_id, ppl[0])
+
+        ce_time = []
+        ce_ppl = []
+
+        total_batch_num = 0  #this is for benchmark
+        for epoch_id in range(max_epoch):
+            ptb_model.train()
+            total_loss = 0.0
+            iters = 0.0
+            init_hidden_data = np.zeros(
+                (num_layers, batch_size, hidden_size), dtype='float32')
+            init_cell_data = np.zeros(
+                (num_layers, batch_size, hidden_size), dtype='float32')
+
+            train_data_iter = reader_decorator(
+                reader.get_data_iter(train_data, batch_size, num_steps))
+
+            train_data_loader = fluid.io.DataLoader.from_generator(capacity=200)
+            train_data_loader.set_batch_generator(train_data_iter, places=place)
+
+            init_hidden = to_variable(init_hidden_data)
+            init_cell = to_variable(init_cell_data)
+            start_time = time.time()
+            for batch_id, batch in enumerate(train_data_loader):
+                if args.max_iter and total_batch_num == args.max_iter:
+                    return
+                batch_start = time.time()
+                x, y = batch
+
+                dy_loss, last_hidden, last_cell = ptb_model(x, y, init_hidden,
+                                                            init_cell)
+                init_hidden = last_hidden.detach()
+                init_cell = last_cell.detach()
+                out_loss = dy_loss.numpy()
+
+                dy_loss.backward()
+                sgd.minimize(dy_loss)
+
+                ptb_model.clear_gradients()
+                total_loss += out_loss
+                batch_end = time.time()
+                train_batch_cost = batch_end - batch_start
+                iters += num_steps
+                total_batch_num = total_batch_num + 1  #this is for benchmark
+
+                if batch_id > 0 and batch_id % log_interval == 0:
+                    ppl = np.exp(total_loss / iters)
+                    print("-- Epoch:[%d]; Batch:[%d]; ppl: %.5f, lr: %.5f, loss: %.5f, batch cost: %.5f" %
+                          (epoch_id, batch_id, ppl[0],
+                           sgd._global_learning_rate().numpy(), out_loss,
+                           train_batch_cost))
+
+            print("one epoch finished", epoch_id)
+            print("time cost ", time.time() - start_time)
+            ppl = np.exp(total_loss / iters)
+            ce_time.append(time.time() - start_time)
+            ce_ppl.append(ppl[0])
+            print("-- Epoch:[%d]; ppl: %.5f" % (epoch_id, ppl[0]))
+
+            if batch_size <= 20 and epoch_id == 0 and ppl[0] > 1000:
+                # for bad init, after first epoch, the loss is over 1000
+                # no more need to continue
+                print(
+                    "Parameters are randomly initialized and not good this time because the loss is over 1000 after the first epoch."
+                )
+                print("Abort this training process and please start again.")
+                return
+
+            save_model_dir = os.path.join(args.save_model_dir,
+                                          str(epoch_id), 'params')
+            fluid.save_dygraph(ptb_model.state_dict(), save_model_dir)
+            print("Saved model to: %s.\n" % save_model_dir)
+
+            eval(ptb_model, valid_data)
+
+        if args.ce:
+            _ppl = 0
+            _time = 0
+            try:
+                _time = ce_time[-1]
+                _ppl = ce_ppl[-1]
+            except:
+                print("ce info error")
+            print("kpis\ttrain_duration_card%s\t%s" % (dev_count, _time))
+            print("kpis\ttrain_ppl_card%s\t%f" % (dev_count, _ppl))
+
+        eval(ptb_model, test_data)
+
+
+train_ptb_lm()
diff --git a/dygraph/resnet/train.py b/dygraph/resnet/train.py
index e92a39bde5bce633dda9452d5c0dad3399092248..5339cadcc88954f63d482f535ad72e5305f30490 100644
--- a/dygraph/resnet/train.py
+++ b/dygraph/resnet/train.py
@@ -81,7 +81,6 @@ def optimizer_setting(parameter_list=None):
             boundaries=bd, values=lr),
         momentum=momentum_rate,
         regularization=fluid.regularizer.L2Decay(l2_decay))
-
     return optimizer
 
 
@@ -116,11 +115,7 @@ class ConvBNLayer(fluid.dygraph.Layer):
 
 
 class BottleneckBlock(fluid.dygraph.Layer):
-    def __init__(self,
-                 num_channels,
-                 num_filters,
-                 stride,
-                 shortcut=True):
+    def __init__(self, num_channels, num_filters, stride, shortcut=True):
         super(BottleneckBlock, self).__init__()
 
         self.conv0 = ConvBNLayer(
@@ -186,16 +181,9 @@ class ResNet(fluid.dygraph.Layer):
         num_filters = [64, 128, 256, 512]
 
         self.conv = ConvBNLayer(
-            num_channels=3,
-            num_filters=64,
-            filter_size=7,
-            stride=2,
-            act='relu')
+            num_channels=3, num_filters=64, filter_size=7, stride=2, act='relu')
         self.pool2d_max = Pool2D(
-            pool_size=3,
-            pool_stride=2,
-            pool_padding=1,
-            pool_type='max')
+            pool_size=3, pool_stride=2, pool_padding=1, pool_type='max')
 
         self.bottleneck_block_list = []
         for block in range(len(depth)):
@@ -220,11 +208,12 @@ class ResNet(fluid.dygraph.Layer):
 
         import math
         stdv = 1.0 / math.sqrt(2048 * 1.0)
-        self.out = Linear(self.pool2d_avg_output,
-                          class_dim,
-                          act='softmax',
-                          param_attr=fluid.param_attr.ParamAttr(
-                              initializer=fluid.initializer.Uniform(-stdv, stdv)))
+        self.out = Linear(
+            self.pool2d_avg_output,
+            class_dim,
+            act='softmax',
+            param_attr=fluid.param_attr.ParamAttr(
+                initializer=fluid.initializer.Uniform(-stdv, stdv)))
 
     def forward(self, inputs):
         y = self.conv(inputs)
@@ -237,6 +226,16 @@ class ResNet(fluid.dygraph.Layer):
         return y
 
 
+def reader_decorator(reader):
+    def __reader__():
+        for item in reader():
+            img = np.array(item[0]).astype('float32').reshape(3, 224, 224)
+            label = np.array(item[1]).astype('int64').reshape(1)
+            yield img, label
+
+    return __reader__
+
+
 def eval(model, data):
     model.eval()
 
@@ -245,15 +244,8 @@ def eval(model, data):
     total_acc5 = 0.0
     total_sample = 0
     for batch_id, data in enumerate(data()):
-        dy_x_data = np.array(
-            [x[0].reshape(3, 224, 224) for x in data]).astype('float32')
-        if len(np.array([x[1] for x in data]).astype('int64')) != batch_size:
-            continue
-        y_data = np.array([x[1] for x in data]).astype('int64').reshape(
-            batch_size, 1)
-
-        img = to_variable(dy_x_data)
-        label = to_variable(y_data)
+        img = data[0]
+        label = data[1]
         label.stop_gradient = True
 
         out = model(img)
@@ -303,13 +295,24 @@ def train_resnet():
         resnet = fluid.dygraph.parallel.DataParallel(resnet, strategy)
 
     train_reader = paddle.batch(
-        paddle.dataset.flowers.train(use_xmap=False), batch_size=batch_size)
+        reader_decorator(paddle.dataset.flowers.train(use_xmap=True)),
+        batch_size=batch_size,
+        drop_last=True)
+
     if args.use_data_parallel:
         train_reader = fluid.contrib.reader.distributed_batch_reader(
             train_reader)
 
     test_reader = paddle.batch(
-        paddle.dataset.flowers.test(use_xmap=False), batch_size=batch_size)
+        reader_decorator(paddle.dataset.flowers.test(use_xmap=True)),
+        batch_size=batch_size,
+        drop_last=True)
+
+    train_loader = fluid.io.DataLoader.from_generator(capacity=10)
+    train_loader.set_sample_list_generator(train_reader, places=place)
+
+    test_loader = fluid.io.DataLoader.from_generator(capacity=10)
+    test_loader.set_sample_list_generator(test_reader, places=place)
 
     #file_name = './model/epoch_0.npz'
     #model_data = np.load( file_name )
@@ -331,23 +334,13 @@ def train_resnet():
 
             print("load finished")
 
-        for batch_id, data in enumerate(train_reader()):
-
+        for batch_id, data in enumerate(train_loader()):
             #NOTE: used in benchmark
             if args.max_iter and total_batch_num == args.max_iter:
                 return
             batch_start = time.time()
-            dy_x_data = np.array(
-                [x[0].reshape(3, 224, 224) for x in data]).astype('float32')
-            if len(np.array([x[1]
-                             for x in data]).astype('int64')) != batch_size:
-                continue
-            y_data = np.array([x[1] for x in data]).astype('int64').reshape(
-                -1, 1)
-
-            img = to_variable(dy_x_data)
-            label = to_variable(y_data)
+            img, label = data
             label.stop_gradient = True
 
             out = resnet(img)
@@ -390,16 +383,14 @@ def train_resnet():
                   (eop, batch_id, total_loss / total_sample, \
                    total_acc1 / total_sample, total_acc5 / total_sample))
         resnet.eval()
-        eval(resnet, test_reader)
+        eval(resnet, test_loader)
 
         save_parameters = (not args.use_data_parallel) or (
             args.use_data_parallel and
             fluid.dygraph.parallel.Env().local_rank == 0)
         if save_parameters:
-            fluid.save_dygraph(resnet.state_dict(),
-                               'resnet_params')
+            fluid.save_dygraph(resnet.state_dict(), 'resnet_params')
 
 
 if __name__ == '__main__':
-
     train_resnet()
diff --git a/dygraph/se_resnet/train.py b/dygraph/se_resnet/train.py
index 67b9dacf2e07e19e07d466683769641830a6fd36..0ba5de46f83dfcd3f3e820ce11aedc9da88925ff 100644
--- a/dygraph/se_resnet/train.py
+++ b/dygraph/se_resnet/train.py
@@ -169,8 +169,7 @@ class BottleneckBlock(fluid.dygraph.Layer):
             act=None)
 
         self.scale = SqueezeExcitation(
-            num_channels=num_filters * 2,
-            reduction_ratio=reduction_ratio)
+            num_channels=num_filters * 2, reduction_ratio=reduction_ratio)
 
         if not shortcut:
             self.short = ConvBNLayer(
@@ -219,10 +218,7 @@ class SeResNeXt(fluid.dygraph.Layer):
                 stride=2,
                 act='relu')
             self.pool = Pool2D(
-                pool_size=3,
-                pool_stride=2,
-                pool_padding=1,
-                pool_type='max')
+                pool_size=3, pool_stride=2, pool_padding=1, pool_type='max')
         elif layers == 101:
             cardinality = 32
             reduction_ratio = 16
@@ -235,10 +231,7 @@ class SeResNeXt(fluid.dygraph.Layer):
                 stride=2,
                 act='relu')
             self.pool = Pool2D(
-                pool_size=3,
-                pool_stride=2,
-                pool_padding=1,
-                pool_type='max')
+                pool_size=3, pool_stride=2, pool_padding=1, pool_type='max')
         elif layers == 152:
             cardinality = 64
             reduction_ratio = 16
@@ -263,10 +256,7 @@ class SeResNeXt(fluid.dygraph.Layer):
                 stride=1,
                 act='relu')
             self.pool = Pool2D(
-                pool_size=3,
-                pool_stride=2,
-                pool_padding=1,
-                pool_type='max')
+                pool_size=3, pool_stride=2, pool_padding=1, pool_type='max')
 
         self.bottleneck_block_list = []
         num_channels = 64
@@ -294,10 +284,11 @@ class SeResNeXt(fluid.dygraph.Layer):
 
         self.pool2d_avg_output = num_filters[len(num_filters) - 1] * 2 * 1 * 1
 
-        self.out = Linear(self.pool2d_avg_output,
-                          class_dim,
-                          param_attr=fluid.param_attr.ParamAttr(
-                              initializer=fluid.initializer.Uniform(-stdv, stdv)))
+        self.out = Linear(
+            self.pool2d_avg_output,
+            class_dim,
+            param_attr=fluid.param_attr.ParamAttr(
+                initializer=fluid.initializer.Uniform(-stdv, stdv)))
 
     def forward(self, inputs):
         if self.layers == 50 or self.layers == 101:
@@ -318,6 +309,16 @@ class SeResNeXt(fluid.dygraph.Layer):
         return y
 
 
+def reader_decorator(reader):
+    def __reader__():
+        for item in reader():
+            img = np.array(item[0]).astype('float32').reshape(3, 224, 224)
+            label = np.array(item[1]).astype('int64').reshape(1)
+            yield img, label
+
+    return __reader__
+
+
 def eval(model, data):
     model.eval()
 
@@ -327,15 +328,7 @@ def eval(model, data):
     total_acc5 = 0.0
     total_sample = 0
     for batch_id, data in enumerate(data()):
-        dy_x_data = np.array(
-            [x[0].reshape(3, 224, 224) for x in data]).astype('float32')
-        if len(np.array([x[1] for x in data]).astype('int64')) != batch_size:
-            continue
-        y_data = np.array([x[1] for x in data]).astype('int64').reshape(
-            batch_size, 1)
-
-        img = to_variable(dy_x_data)
-        label = to_variable(y_data)
+        img, label = data
         label.stop_gradient = True
 
         out = model(img)
@@ -389,29 +382,29 @@ def train():
         se_resnext = fluid.dygraph.parallel.DataParallel(se_resnext, strategy)
 
     train_reader = paddle.batch(
-        paddle.dataset.flowers.train(use_xmap=False),
+        reader_decorator(paddle.dataset.flowers.train(use_xmap=False)),
         batch_size=batch_size,
         drop_last=True)
     if args.use_data_parallel:
         train_reader = fluid.contrib.reader.distributed_batch_reader(
             train_reader)
 
     test_reader = paddle.batch(
-        paddle.dataset.flowers.test(use_xmap=False), batch_size=32)
+        reader_decorator(paddle.dataset.flowers.test(use_xmap=False)),
+        batch_size=32)
+
+    train_loader = fluid.io.DataLoader.from_generator(capacity=10)
+    train_loader.set_sample_list_generator(train_reader, places=place)
+
+    test_loader = fluid.io.DataLoader.from_generator(capacity=10)
+    test_loader.set_sample_list_generator(test_reader, places=place)
 
     for epoch_id in range(epoch_num):
         total_loss = 0.0
         total_acc1 = 0.0
         total_acc5 = 0.0
        total_sample = 0
-        for batch_id, data in enumerate(train_reader()):
-
-            dy_x_data = np.array([x[0].reshape(3, 224, 224)
-                                  for x in data]).astype('float32')
-            y_data = np.array([x[1] for x in data]).astype('int64').reshape(
-                batch_size, 1)
-
-            img = to_variable(dy_x_data)
-            label = to_variable(y_data)
+        for batch_id, data in enumerate(train_loader()):
+            img, label = data
             label.stop_gradient = True
 
             out = se_resnext(img)
@@ -454,7 +447,7 @@ def train():
                   (epoch_id, batch_id, total_loss / total_sample, \
                    total_acc1 / total_sample, total_acc5 / total_sample))
             se_resnext.eval()
-            eval(se_resnext, test_reader)
+            eval(se_resnext, test_loader)
             se_resnext.train()
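Both `train.py` diffs above apply the same migration: a `reader_decorator` reshapes each raw sample into a fixed-shape `(float32 image, int64 label)` pair, so per-batch numpy munging inside the training loop can be replaced by `img, label = data` from a `DataLoader`. A minimal framework-free sketch of just the decorator half (numpy only; `toy_reader` and the 3x4x4 "image" size are hypothetical stand-ins for `paddle.dataset.flowers` and 3x224x224):

```python
import numpy as np

def reader_decorator(reader):
    # Wrap a sample-level reader so every yielded item is a
    # (float32 CHW image, int64 label) pair with fixed shapes --
    # the per-sample format a batching DataLoader expects.
    def __reader__():
        for item in reader():
            img = np.array(item[0]).astype('float32').reshape(3, 4, 4)
            label = np.array(item[1]).astype('int64').reshape(1)
            yield img, label

    return __reader__

def toy_reader():
    # Hypothetical stand-in for paddle.dataset.flowers.train():
    # yields (flat pixel array, integer label) tuples.
    for i in range(3):
        yield np.zeros(3 * 4 * 4), i

samples = list(reader_decorator(toy_reader)())
print(len(samples), samples[0][0].shape, samples[0][1].dtype)
```

Because each sample now arrives pre-shaped and pre-typed, the old in-loop checks (`reshape(3, 224, 224)`, the `!= batch_size` skip, `to_variable` conversion) become unnecessary, which is exactly what the `-` lines in both eval/train loops remove.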