Commit 05403680, author: guosheng

Merge branch 'develop' of https://github.com/PaddlePaddle/models into cloudtest2

[submodule "fluid/SimNet"]
path = fluid/SimNet
url = https://github.com/baidu/AnyQ.git
[submodule "fluid/LAC"]
path = fluid/LAC
url = https://github.com/baidu/lac
[submodule "fluid/Senta"]
path = fluid/Senta
url = https://github.com/baidu/Senta
...@@ -8,7 +8,7 @@ PaddlePaddle provides a rich set of computational units to enable users to adopt ...@@ -8,7 +8,7 @@ PaddlePaddle provides a rich set of computational units to enable users to adopt
- [fluid models](fluid): use PaddlePaddle's Fluid APIs. We especially recommend users to use Fluid models. - [fluid models](fluid): use PaddlePaddle's Fluid APIs. We especially recommend users to use Fluid models.
- [v2 models](v2): use PaddlePaddle's v2 APIs. - [legacy models](legacy): use PaddlePaddle's v2 APIs.
## License ## License
......
...@@ -2,7 +2,6 @@ from __future__ import absolute_import ...@@ -2,7 +2,6 @@ from __future__ import absolute_import
from __future__ import division from __future__ import division
from __future__ import print_function from __future__ import print_function
import paddle.v2 as paddle
import paddle.fluid as fluid import paddle.fluid as fluid
......
Subproject commit 66660503bb6e8f34adc4715ccf42cad77ed46ded
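...@@ -49,14 +49,46 @@ Network, ICNet) for semantic segmentation. Compared with other segmentation algorithms, ICNet balances accuracy ...@@ -49,14 +49,46 @@ Network, ICNet) for semantic segmentation. Compared with other segmentation algorithms, ICNet balances accuracy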
- `ICNet <https://github.com/PaddlePaddle/models/tree/develop/fluid/icnet>`__ - `ICNet <https://github.com/PaddlePaddle/models/tree/develop/fluid/icnet>`__
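Image Generation
----------------
Image generation produces a target image from an input vector, which can be random noise or a user-specified condition vector. Applications include handwriting generation, face synthesis, style transfer, and image inpainting. Current image generation tasks rely mainly on generative adversarial networks (GANs).
A GAN consists of two sub-networks: a generator and a discriminator. The generator takes random noise or a condition vector as input and outputs the target image. The discriminator is a classifier that takes an image as input and outputs whether that image is real. During training, the two networks improve by continually competing against each other.
For the image generation task, we show how to generate handwritten digits with DCGAN and ConditionalGAN, and also introduce CycleGAN for style transfer.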
- `DCGAN & ConditionalGAN <https://github.com/PaddlePaddle/models/tree/develop/fluid/gan/c_gan>`__
- `CycleGAN <https://github.com/PaddlePaddle/models/tree/develop/fluid/gan/cycle_gan>`__
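Scene Text Recognition Scene Text Recognition
---------------------- ----------------------
Many scene images contain rich text that is important for understanding the image and greatly helps people recognize and interpret its content. Scene text recognition converts image information into a character sequence under conditions such as complex backgrounds, low resolution, varied fonts, and arbitrary layout. It can be viewed as a special kind of translation: from image input to natural-language output. Advances in scene text recognition have also enabled new applications, such as automatically reading street signs to give street-view applications more accurate address information. Many scene images contain rich text that is important for understanding the image and greatly helps people recognize and interpret its content. Scene text recognition converts image information into a character sequence under conditions such as complex backgrounds, low resolution, varied fonts, and arbitrary layout. It can be viewed as a special kind of translation: from image input to natural-language output. Advances in scene text recognition have also enabled new applications, such as automatically reading street signs to give street-view applications more accurate address information.
For the scene text recognition task, we show how to combine CNN-based image feature extraction with RNN-based sequence translation, eliminating hand-crafted features and character segmentation, and using automatically learned image features to perform end-to-end unconstrained character localization and recognition. Currently the CRNN-CTC model is covered; an attention-based sequence-to-sequence model will be added later. For the scene text recognition task, we show how to combine CNN-based image feature extraction with RNN-based sequence translation, eliminating hand-crafted features and character segmentation, and using automatically learned image features to perform character recognition. Currently the CRNN-CTC model and an attention-based sequence-to-sequence model are covered.
- `CRNN-CTC model <https://github.com/PaddlePaddle/models/tree/develop/fluid/ocr_recognition>`__ - `CRNN-CTC model <https://github.com/PaddlePaddle/models/tree/develop/fluid/ocr_recognition>`__
- `Attention model <https://github.com/PaddlePaddle/models/tree/develop/fluid/ocr_recognition>`__
Metric Learning
---------------
Metric learning, also called distance metric learning or similarity learning, learns distances between objects and can therefore be used to analyze how objects relate and compare. It is widely applied in practice: it can assist classification and clustering, and is also widely used in image retrieval, face recognition, and similar fields. Previously, each task required selecting suitable features and manually constructing a distance function, whereas metric learning learns a task-specific distance function on its own. Combined with deep learning, metric learning has achieved good performance in face recognition/verification, person re-identification (human Re-ID), image retrieval, and other fields. In this task we mainly introduce Fluid-based deep metric learning models, including loss functions such as triplet and quadruplet loss.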
- `Metric Learning <https://github.com/PaddlePaddle/models/tree/develop/fluid/metric_learning>`__
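Video Classification
--------------------
Video classification is the foundation of video understanding tasks. Unlike image classification, the object being classified is no longer a still image but a video made up of multiple frames and containing audio, motion information, and so on. Understanding a video therefore requires more context: not only what each frame is and contains, but also how different frames relate to one another. Video classification methods are mainly based on convolutional neural networks, recurrent neural networks, or a combination of the two. In this task we introduce Fluid-based video classification models; the Temporal Segment Network (TSN) model is currently included, and more models will be added over time.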
- `TSN <https://github.com/PaddlePaddle/models/tree/develop/fluid/video_classification>`__
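Speech Recognition Speech Recognition
------------------ ------------------
...@@ -124,6 +156,15 @@ DQN and its variants, and tested their performance on Atari games. ...@@ -124,6 +156,15 @@ DQN and its variants, and tested their performance on Atari games.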
- `Senta <https://github.com/baidu/Senta/blob/master/README.md>`__ - `Senta <https://github.com/baidu/Senta/blob/master/README.md>`__
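Semantic Matching
-----------------
Many natural language processing scenarios require measuring the semantic similarity of two texts; such tasks are usually called semantic matching. Examples include ranking search results by the similarity between a query and candidate documents, computing text-to-text similarity for deduplication, and matching candidate answers to questions in automatic question answering.
The DAM (Deep Attention Matching Network) released in this example is work by Baidu's Natural Language Processing Department published at ACL 2018, used for response selection in multi-turn dialogue for retrieval-based chatbots. Inspired by the Transformer, DAM is built entirely on attention: it uses stacked self-attention to learn semantic representations of the response and the context at different granularities, then uses cross-attention to capture the relevance between response and context. It outperforms other models on two large-scale multi-turn dialogue datasets.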
- `Deep Attention Matching Network <https://github.com/PaddlePaddle/models/tree/develop/fluid/deep_attention_matching_net>`__
AnyQ AnyQ
---- ----
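...@@ -135,3 +176,12 @@ SimNet is a semantic matching framework developed in-house by Baidu's Natural Language Processing Department in 2013 ...@@ -135,3 +176,12 @@ SimNet is a semantic matching framework developed in-house by Baidu's Natural Language Processing Department in 2013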
- `SimNet in PaddlePaddle - `SimNet in PaddlePaddle
Fluid <https://github.com/baidu/AnyQ/blob/master/tools/simnet/train/paddle/README.md>`__ Fluid <https://github.com/baidu/AnyQ/blob/master/tools/simnet/train/paddle/README.md>`__
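Machine Reading Comprehension
-----------------------------
Machine reading comprehension (MRC) is one of the core tasks in natural language processing (NLP). Its ultimate goal is for machines to read text as humans do, extract information from it, and answer related questions. Deep learning has been widely adopted in NLP in recent years and has substantially improved MRC capability. However, the MRC work studied so far uses artificially constructed datasets and answers relatively simple questions, which still differs markedly from the data humans handle, so large-scale real-world training data is urgently needed to advance MRC.
The Baidu reading comprehension dataset is a real-world dataset open-sourced by Baidu's Natural Language Processing Department. All questions and source documents come from real data (Baidu search engine data and the Baidu Zhidao Q&A community), and the answers are written by humans. Each question has multiple answers. The dataset contains 200k questions, 1,000k source documents, and 420k answers, making it the largest Chinese MRC dataset at the time of writing. Baidu also open-sourced a corresponding reading comprehension model, called DuReader. It adopts the commonly used layered network architecture, uses a bidirectional attention mechanism to capture the interaction between question and document and produce a query-aware document representation, and finally predicts the answer span from that representation with a pointer network.
- `DuReader in PaddlePaddle Fluid <https://github.com/PaddlePaddle/models/blob/develop/fluid/machine_reading_comprehension/README.md>`__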
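...@@ -28,8 +28,11 @@ a tool for Fluid model configuration and parameter files. ...@@ -28,8 +28,11 @@ a tool for Fluid model configuration and parameter files.
Detecting faces in open environments, especially small, blurry, and partially occluded faces, is also a challenging task. We also show how to train Baidu's in-house PyramidBox face detection model on [WIDER FACE](http://mmlab.ie.cuhk.edu.hk/projects/WIDERFace) data; in March 2018 the algorithm ranked [first](http://mmlab.ie.cuhk.edu.hk/projects/WIDERFace/WiderFace_Results.html) in multiple WIDER FACE evaluations. Detecting faces in open environments, especially small, blurry, and partially occluded faces, is also a challenging task. We also show how to train Baidu's in-house PyramidBox face detection model on [WIDER FACE](http://mmlab.ie.cuhk.edu.hk/projects/WIDERFace) data; in March 2018 the algorithm ranked [first](http://mmlab.ie.cuhk.edu.hk/projects/WIDERFace/WiderFace_Results.html) in multiple WIDER FACE evaluations.
Faster RCNN is a typical two-stage object detector. Compared with traditional region proposal methods, the RPN in Faster RCNN greatly improves the efficiency of region proposal by sharing convolutional layer parameters, and produces high-quality candidate regions.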
- [Single Shot MultiBox Detector](https://github.com/PaddlePaddle/models/blob/develop/fluid/object_detection/README_cn.md) - [Single Shot MultiBox Detector](https://github.com/PaddlePaddle/models/blob/develop/fluid/object_detection/README_cn.md)
- [Face Detector: PyramidBox](https://github.com/PaddlePaddle/models/tree/develop/fluid/face_detection/README_cn.md) - [Face Detector: PyramidBox](https://github.com/PaddlePaddle/models/tree/develop/fluid/face_detection/README_cn.md)
- [Faster RCNN](https://github.com/PaddlePaddle/models/tree/develop/fluid/faster_rcnn/README_cn.md)
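Image Semantic Segmentation Image Semantic Segmentation
--------------------------- ---------------------------
...@@ -41,14 +44,45 @@ Network, ICNet) for semantic segmentation. Compared with other segmentation algorithms, ICNet balances accuracy ...@@ -41,14 +44,45 @@ Network, ICNet) for semantic segmentation. Compared with other segmentation algorithms, ICNet balances accuracy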
- [ICNet](https://github.com/PaddlePaddle/models/tree/develop/fluid/icnet) - [ICNet](https://github.com/PaddlePaddle/models/tree/develop/fluid/icnet)
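Image Generation
----------------
Image generation produces a target image from an input vector, which can be random noise or a user-specified condition vector. Applications include handwriting generation, face synthesis, style transfer, and image inpainting. Current image generation tasks rely mainly on generative adversarial networks (GANs).
A GAN consists of two sub-networks: a generator and a discriminator. The generator takes random noise or a condition vector as input and outputs the target image. The discriminator is a classifier that takes an image as input and outputs whether that image is real. During training, the two networks improve by continually competing against each other.
For the image generation task, we show how to generate handwritten digits with DCGAN and ConditionalGAN, and also introduce CycleGAN for style transfer.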
- [DCGAN & ConditionalGAN](https://github.com/PaddlePaddle/models/tree/develop/fluid/gan/c_gan)
- [CycleGAN](https://github.com/PaddlePaddle/models/tree/develop/fluid/gan/cycle_gan)
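Scene Text Recognition Scene Text Recognition
---------------------- ----------------------
Many scene images contain rich text that is important for understanding the image and greatly helps people recognize and interpret its content. Scene text recognition converts image information into a character sequence under conditions such as complex backgrounds, low resolution, varied fonts, and arbitrary layout. It can be viewed as a special kind of translation: from image input to natural-language output. Advances in scene text recognition have also enabled new applications, such as automatically reading street signs to give street-view applications more accurate address information. Many scene images contain rich text that is important for understanding the image and greatly helps people recognize and interpret its content. Scene text recognition converts image information into a character sequence under conditions such as complex backgrounds, low resolution, varied fonts, and arbitrary layout. It can be viewed as a special kind of translation: from image input to natural-language output. Advances in scene text recognition have also enabled new applications, such as automatically reading street signs to give street-view applications more accurate address information.
For the scene text recognition task, we show how to combine CNN-based image feature extraction with RNN-based sequence translation, eliminating hand-crafted features and character segmentation, and using automatically learned image features to perform end-to-end unconstrained character localization and recognition. Currently the CRNN-CTC model is covered; an attention-based sequence-to-sequence model will be added later. For the scene text recognition task, we show how to combine CNN-based image feature extraction with RNN-based sequence translation, eliminating hand-crafted features and character segmentation, and using automatically learned image features to perform character recognition. Currently the CRNN-CTC model and an attention-based sequence-to-sequence model are covered.
- [CRNN-CTC model](https://github.com/PaddlePaddle/models/tree/develop/fluid/ocr_recognition)
- [Attention model](https://github.com/PaddlePaddle/models/tree/develop/fluid/ocr_recognition)
Metric Learning
---------------
Metric learning, also called distance metric learning or similarity learning, learns distances between objects and can therefore be used to analyze how objects relate and compare. It is widely applied in practice: it can assist classification and clustering, and is also widely used in image retrieval, face recognition, and similar fields. Previously, each task required selecting suitable features and manually constructing a distance function, whereas metric learning learns a task-specific distance function on its own. Combined with deep learning, metric learning has achieved good performance in face recognition/verification, person re-identification (human Re-ID), image retrieval, and other fields. In this task we mainly introduce Fluid-based deep metric learning models, including loss functions such as triplet and quadruplet loss.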
- [Metric Learning](https://github.com/PaddlePaddle/models/tree/develop/fluid/metric_learning)
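The triplet loss mentioned in the metric learning section can be sketched in a few lines. This is a minimal pure-Python illustration, not the code in `fluid/metric_learning`; the function names, the squared Euclidean distance, and the margin value are all assumptions made for the example.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def triplet_loss(anchor, positive, negative, margin=0.1):
    """Hinge-style triplet loss: pull the positive closer than the negative.

    The loss is zero once the negative is farther from the anchor than
    the positive by at least `margin` (a hypothetical default value).
    """
    d_pos = squared_distance(anchor, positive)
    d_neg = squared_distance(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


# A well-separated triplet contributes no loss; a violating one does.
easy = triplet_loss([0.0, 0.0], [0.1, 0.0], [1.0, 1.0])
hard = triplet_loss([0.0, 0.0], [1.0, 0.0], [0.5, 0.0])
```

A quadruplet loss would add a second negative sample and a second margin term along the same lines.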
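Video Classification
--------------------
Video classification is the foundation of video understanding tasks. Unlike image classification, the object being classified is no longer a still image but a video made up of multiple frames and containing audio, motion information, and so on. Understanding a video therefore requires more context: not only what each frame is and contains, but also how different frames relate to one another. Video classification methods are mainly based on convolutional neural networks, recurrent neural networks, or a combination of the two. In this task we introduce Fluid-based video classification models; the Temporal Segment Network (TSN) model is currently included, and more models will be added over time.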
- [TSN](https://github.com/PaddlePaddle/models/tree/develop/fluid/video_classification)
- [CRNN-CTC model](https://github.com/PaddlePaddle/models/tree/develop/fluid/ocr_recognition)
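Speech Recognition Speech Recognition
------------------ ------------------
...@@ -94,6 +128,15 @@ Machine Translation, NMT) and other stages. Only after NMT matured was machine translation truly ...@@ -94,6 +128,15 @@ Machine Translation, NMT) and other stages. Only after NMT matured was machine translation truly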
- [Senta](https://github.com/baidu/Senta/blob/master/README.md) - [Senta](https://github.com/baidu/Senta/blob/master/README.md)
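Semantic Matching
-----------------
Many natural language processing scenarios require measuring the semantic similarity of two texts; such tasks are usually called semantic matching. Examples include ranking search results by the similarity between a query and candidate documents, computing text-to-text similarity for deduplication, and matching candidate answers to questions in automatic question answering.
The DAM (Deep Attention Matching Network) released in this example is work by Baidu's Natural Language Processing Department published at ACL 2018, used for response selection in multi-turn dialogue for retrieval-based chatbots. Inspired by the Transformer, DAM is built entirely on attention: it uses stacked self-attention to learn semantic representations of the response and the context at different granularities, then uses cross-attention to capture the relevance between response and context. It outperforms other models on two large-scale multi-turn dialogue datasets.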
- [Deep Attention Matching Network](https://github.com/PaddlePaddle/models/tree/develop/fluid/deep_attention_matching_net)
AnyQ AnyQ
---- ----
...@@ -102,3 +145,12 @@ AnyQ ...@@ -102,3 +145,12 @@ AnyQ
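SimNet is a semantic matching framework developed in-house by Baidu's Natural Language Processing Department in 2013. It is widely used across Baidu products and mainly includes core network structures such as BOW, CNN, RNN, and MM-DNN. The framework also integrates mainstream semantic matching models from academia, such as MatchPyramid, MV-LSTM, and K-NRM. Models built with SimNet can be easily added to the AnyQ system to strengthen its semantic matching capability. SimNet is a semantic matching framework developed in-house by Baidu's Natural Language Processing Department in 2013. It is widely used across Baidu products and mainly includes core network structures such as BOW, CNN, RNN, and MM-DNN. The framework also integrates mainstream semantic matching models from academia, such as MatchPyramid, MV-LSTM, and K-NRM. Models built with SimNet can be easily added to the AnyQ system to strengthen its semantic matching capability.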
- [SimNet in PaddlePaddle Fluid](https://github.com/baidu/AnyQ/blob/master/tools/simnet/train/paddle/README.md) - [SimNet in PaddlePaddle Fluid](https://github.com/baidu/AnyQ/blob/master/tools/simnet/train/paddle/README.md)
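Machine Reading Comprehension
-----------------------------
Machine reading comprehension (MRC) is one of the core tasks in natural language processing (NLP). Its ultimate goal is for machines to read text as humans do, extract information from it, and answer related questions. Deep learning has been widely adopted in NLP in recent years and has substantially improved MRC capability. However, the MRC work studied so far uses artificially constructed datasets and answers relatively simple questions, which still differs markedly from the data humans handle, so large-scale real-world training data is urgently needed to advance MRC.
The Baidu reading comprehension dataset is a real-world dataset open-sourced by Baidu's Natural Language Processing Department. All questions and source documents come from real data (Baidu search engine data and the Baidu Zhidao Q&A community), and the answers are written by humans. Each question has multiple answers. The dataset contains 200k questions, 1,000k source documents, and 420k answers, making it the largest Chinese MRC dataset at the time of writing. Baidu also open-sourced a corresponding reading comprehension model, called DuReader. It adopts the commonly used layered network architecture, uses a bidirectional attention mechanism to capture the interaction between question and document and produce a query-aware document representation, and finally predicts the answer span from that representation with a pointer network.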
- [DuReader in PaddlePaddle Fluid](https://github.com/PaddlePaddle/models/blob/develop/fluid/machine_reading_comprehension/README.md)
Subproject commit 870651e257750f2c237f0b0bc9a27e5d062d1909
Subproject commit 4dbe7f7b0e76c188eb7f448d104f0165f0a12229
""" """
CNN on mnist data using fluid api of paddlepaddle CNN on mnist data using fluid api of paddlepaddle
""" """
import paddle.v2 as paddle import paddle
import paddle.fluid as fluid import paddle.fluid as fluid
......
...@@ -8,7 +8,7 @@ sys.path.append("..") ...@@ -8,7 +8,7 @@ sys.path.append("..")
import matplotlib.pyplot as plt import matplotlib.pyplot as plt
import paddle.fluid as fluid import paddle.fluid as fluid
import paddle.v2 as paddle import paddle
from advbox.adversary import Adversary from advbox.adversary import Adversary
from advbox.attacks.gradient_method import BIM from advbox.attacks.gradient_method import BIM
......
...@@ -8,7 +8,7 @@ sys.path.append("..") ...@@ -8,7 +8,7 @@ sys.path.append("..")
import matplotlib.pyplot as plt import matplotlib.pyplot as plt
import paddle.fluid as fluid import paddle.fluid as fluid
import paddle.v2 as paddle import paddle
from advbox.adversary import Adversary from advbox.adversary import Adversary
from advbox.attacks.deepfool import DeepFoolAttack from advbox.attacks.deepfool import DeepFoolAttack
......
...@@ -8,7 +8,7 @@ sys.path.append("..") ...@@ -8,7 +8,7 @@ sys.path.append("..")
import matplotlib.pyplot as plt import matplotlib.pyplot as plt
import numpy as np import numpy as np
import paddle.fluid as fluid import paddle.fluid as fluid
import paddle.v2 as paddle import paddle
from advbox.adversary import Adversary from advbox.adversary import Adversary
from advbox.attacks.gradient_method import FGSM from advbox.attacks.gradient_method import FGSM
......
...@@ -7,7 +7,7 @@ sys.path.append("..") ...@@ -7,7 +7,7 @@ sys.path.append("..")
import matplotlib.pyplot as plt import matplotlib.pyplot as plt
import paddle.fluid as fluid import paddle.fluid as fluid
import paddle.v2 as paddle import paddle
from advbox.adversary import Adversary from advbox.adversary import Adversary
from advbox.attacks.gradient_method import ILCM from advbox.attacks.gradient_method import ILCM
......
...@@ -7,7 +7,7 @@ sys.path.append("..") ...@@ -7,7 +7,7 @@ sys.path.append("..")
import matplotlib.pyplot as plt import matplotlib.pyplot as plt
import paddle.fluid as fluid import paddle.fluid as fluid
import paddle.v2 as paddle import paddle
from advbox.adversary import Adversary from advbox.adversary import Adversary
from advbox.attacks.saliency import JSMA from advbox.attacks.saliency import JSMA
......
...@@ -7,7 +7,7 @@ sys.path.append("..") ...@@ -7,7 +7,7 @@ sys.path.append("..")
import matplotlib.pyplot as plt import matplotlib.pyplot as plt
import paddle.fluid as fluid import paddle.fluid as fluid
import paddle.v2 as paddle import paddle
from advbox.adversary import Adversary from advbox.adversary import Adversary
from advbox.attacks.lbfgs import LBFGS from advbox.attacks.lbfgs import LBFGS
......
...@@ -9,7 +9,7 @@ sys.path.append("..") ...@@ -9,7 +9,7 @@ sys.path.append("..")
import matplotlib.pyplot as plt import matplotlib.pyplot as plt
import numpy as np import numpy as np
import paddle.fluid as fluid import paddle.fluid as fluid
import paddle.v2 as paddle import paddle
from advbox.adversary import Adversary from advbox.adversary import Adversary
from advbox.attacks.gradient_method import MIFGSM from advbox.attacks.gradient_method import MIFGSM
......
import cPickle as pickle import six
import numpy as np import numpy as np
import paddle.fluid as fluid import paddle.fluid as fluid
import utils.layers as layers import utils.layers as layers
...@@ -29,7 +29,7 @@ class Net(object): ...@@ -29,7 +29,7 @@ class Net(object):
mask_cache = dict() if self.use_mask_cache else None mask_cache = dict() if self.use_mask_cache else None
turns_data = [] turns_data = []
for i in xrange(self._max_turn_num): for i in six.moves.xrange(self._max_turn_num):
turn = fluid.layers.data( turn = fluid.layers.data(
name="turn_%d" % i, name="turn_%d" % i,
shape=[self._max_turn_len, 1], shape=[self._max_turn_len, 1],
...@@ -37,7 +37,7 @@ class Net(object): ...@@ -37,7 +37,7 @@ class Net(object):
turns_data.append(turn) turns_data.append(turn)
turns_mask = [] turns_mask = []
for i in xrange(self._max_turn_num): for i in six.moves.xrange(self._max_turn_num):
turn_mask = fluid.layers.data( turn_mask = fluid.layers.data(
name="turn_mask_%d" % i, name="turn_mask_%d" % i,
shape=[self._max_turn_len, 1], shape=[self._max_turn_len, 1],
...@@ -64,7 +64,7 @@ class Net(object): ...@@ -64,7 +64,7 @@ class Net(object):
Hr = response_emb Hr = response_emb
Hr_stack = [Hr] Hr_stack = [Hr]
for index in range(self._stack_num): for index in six.moves.xrange(self._stack_num):
Hr = layers.block( Hr = layers.block(
name="response_self_stack" + str(index), name="response_self_stack" + str(index),
query=Hr, query=Hr,
...@@ -78,7 +78,7 @@ class Net(object): ...@@ -78,7 +78,7 @@ class Net(object):
# context part # context part
sim_turns = [] sim_turns = []
for t in xrange(self._max_turn_num): for t in six.moves.xrange(self._max_turn_num):
Hu = fluid.layers.embedding( Hu = fluid.layers.embedding(
input=turns_data[t], input=turns_data[t],
size=[self._vocab_size + 1, self._emb_size], size=[self._vocab_size + 1, self._emb_size],
...@@ -88,7 +88,7 @@ class Net(object): ...@@ -88,7 +88,7 @@ class Net(object):
initializer=fluid.initializer.Normal(scale=0.1))) initializer=fluid.initializer.Normal(scale=0.1)))
Hu_stack = [Hu] Hu_stack = [Hu]
for index in range(self._stack_num): for index in six.moves.xrange(self._stack_num):
# share parameters # share parameters
Hu = layers.block( Hu = layers.block(
name="turn_self_stack" + str(index), name="turn_self_stack" + str(index),
...@@ -104,7 +104,7 @@ class Net(object): ...@@ -104,7 +104,7 @@ class Net(object):
# cross attention # cross attention
r_a_t_stack = [] r_a_t_stack = []
t_a_r_stack = [] t_a_r_stack = []
for index in range(self._stack_num + 1): for index in six.moves.xrange(self._stack_num + 1):
t_a_r = layers.block( t_a_r = layers.block(
name="t_attend_r_" + str(index), name="t_attend_r_" + str(index),
query=Hu_stack[index], query=Hu_stack[index],
...@@ -134,7 +134,7 @@ class Net(object): ...@@ -134,7 +134,7 @@ class Net(object):
t_a_r = fluid.layers.stack(t_a_r_stack, axis=1) t_a_r = fluid.layers.stack(t_a_r_stack, axis=1)
r_a_t = fluid.layers.stack(r_a_t_stack, axis=1) r_a_t = fluid.layers.stack(r_a_t_stack, axis=1)
else: else:
for index in xrange(len(t_a_r_stack)): for index in six.moves.xrange(len(t_a_r_stack)):
t_a_r_stack[index] = fluid.layers.unsqueeze( t_a_r_stack[index] = fluid.layers.unsqueeze(
input=t_a_r_stack[index], axes=[1]) input=t_a_r_stack[index], axes=[1])
r_a_t_stack[index] = fluid.layers.unsqueeze( r_a_t_stack[index] = fluid.layers.unsqueeze(
...@@ -151,7 +151,7 @@ class Net(object): ...@@ -151,7 +151,7 @@ class Net(object):
if self.use_stack_op: if self.use_stack_op:
sim = fluid.layers.stack(sim_turns, axis=2) sim = fluid.layers.stack(sim_turns, axis=2)
else: else:
for index in xrange(len(sim_turns)): for index in six.moves.xrange(len(sim_turns)):
sim_turns[index] = fluid.layers.unsqueeze( sim_turns[index] = fluid.layers.unsqueeze(
input=sim_turns[index], axes=[2]) input=sim_turns[index], axes=[2])
# sim shape: [batch_size, 2*(stack_num+1), max_turn_num, max_turn_len, max_turn_len] # sim shape: [batch_size, 2*(stack_num+1), max_turn_num, max_turn_len, max_turn_len]
......
import os import os
import six
import numpy as np import numpy as np
import time import time
import argparse import argparse
...@@ -6,8 +7,12 @@ import multiprocessing ...@@ -6,8 +7,12 @@ import multiprocessing
import paddle import paddle
import paddle.fluid as fluid import paddle.fluid as fluid
import utils.reader as reader import utils.reader as reader
import cPickle as pickle from utils.util import print_arguments, mkdir
from utils.util import print_arguments
try:
import cPickle as pickle #python 2
except ImportError as e:
import pickle #python 3
from model import Net from model import Net
...@@ -107,7 +112,7 @@ def parse_args(): ...@@ -107,7 +112,7 @@ def parse_args():
def test(args): def test(args):
if not os.path.exists(args.save_path): if not os.path.exists(args.save_path):
raise ValueError("Invalid save path %s" % args.save_path) mkdir(args.save_path)
if not os.path.exists(args.model_path): if not os.path.exists(args.model_path):
raise ValueError("Invalid model init path %s" % args.model_path) raise ValueError("Invalid model init path %s" % args.model_path)
# data data_config # data data_config
...@@ -158,7 +163,11 @@ def test(args): ...@@ -158,7 +163,11 @@ def test(args):
use_cuda=args.use_cuda, main_program=test_program) use_cuda=args.use_cuda, main_program=test_program)
print("start loading data ...") print("start loading data ...")
train_data, val_data, test_data = pickle.load(open(args.data_path, 'rb')) with open(args.data_path, 'rb') as f:
if six.PY2:
train_data, val_data, test_data = pickle.load(f)
else:
train_data, val_data, test_data = pickle.load(f, encoding="bytes")
print("finish loading data ...") print("finish loading data ...")
if args.ext_eval: if args.ext_eval:
...@@ -178,9 +187,9 @@ def test(args): ...@@ -178,9 +187,9 @@ def test(args):
score_path = os.path.join(args.save_path, 'score.txt') score_path = os.path.join(args.save_path, 'score.txt')
score_file = open(score_path, 'w') score_file = open(score_path, 'w')
for it in xrange(test_batch_num // dev_count): for it in six.moves.xrange(test_batch_num // dev_count):
feed_list = [] feed_list = []
for dev in xrange(dev_count): for dev in six.moves.xrange(dev_count):
index = it * dev_count + dev index = it * dev_count + dev
feed_dict = reader.make_one_batch_input(test_batches, index) feed_dict = reader.make_one_batch_input(test_batches, index)
feed_list.append(feed_dict) feed_list.append(feed_dict)
...@@ -190,9 +199,9 @@ def test(args): ...@@ -190,9 +199,9 @@ def test(args):
scores = np.array(predicts[0]) scores = np.array(predicts[0])
print("step = %d" % it) print("step = %d" % it)
for dev in xrange(dev_count): for dev in six.moves.xrange(dev_count):
index = it * dev_count + dev index = it * dev_count + dev
for i in xrange(args.batch_size): for i in six.moves.xrange(args.batch_size):
score_file.write( score_file.write(
str(scores[args.batch_size * dev + i][0]) + '\t' + str( str(scores[args.batch_size * dev + i][0]) + '\t' + str(
test_batches["label"][index][i]) + '\n') test_batches["label"][index][i]) + '\n')
......
import os import os
import six
import numpy as np import numpy as np
import time import time
import argparse import argparse
...@@ -6,9 +7,13 @@ import multiprocessing ...@@ -6,9 +7,13 @@ import multiprocessing
import paddle import paddle
import paddle.fluid as fluid import paddle.fluid as fluid
import utils.reader as reader import utils.reader as reader
import cPickle as pickle
from utils.util import print_arguments from utils.util import print_arguments
try:
import cPickle as pickle #python 2
except ImportError as e:
import pickle #python 3
from model import Net from model import Net
...@@ -164,35 +169,45 @@ def train(args): ...@@ -164,35 +169,45 @@ def train(args):
if args.word_emb_init is not None: if args.word_emb_init is not None:
print("start loading word embedding init ...") print("start loading word embedding init ...")
word_emb = np.array(pickle.load(open(args.word_emb_init, 'rb'))).astype( if six.PY2:
word_emb = np.array(pickle.load(open(args.word_emb_init,
'rb'))).astype('float32')
else:
word_emb = np.array(
pickle.load(
open(args.word_emb_init, 'rb'), encoding="bytes")).astype(
'float32') 'float32')
dam.set_word_embedding(word_emb, place) dam.set_word_embedding(word_emb, place)
print("finish init word embedding ...") print("finish init word embedding ...")
print("start loading data ...") print("start loading data ...")
train_data, val_data, test_data = pickle.load(open(args.data_path, 'rb')) with open(args.data_path, 'rb') as f:
if six.PY2:
train_data, val_data, test_data = pickle.load(f)
else:
train_data, val_data, test_data = pickle.load(f, encoding="bytes")
print("finish loading data ...") print("finish loading data ...")
val_batches = reader.build_batches(val_data, data_conf) val_batches = reader.build_batches(val_data, data_conf)
batch_num = len(train_data['y']) / args.batch_size batch_num = len(train_data[six.b('y')]) // args.batch_size
val_batch_num = len(val_batches["response"]) val_batch_num = len(val_batches["response"])
print_step = max(1, batch_num / (dev_count * 100)) print_step = max(1, batch_num // (dev_count * 100))
save_step = max(1, batch_num / (dev_count * 10)) save_step = max(1, batch_num // (dev_count * 10))
print("begin model training ...") print("begin model training ...")
print(time.strftime('%Y-%m-%d %H:%M:%S', time.localtime(time.time()))) print(time.strftime('%Y-%m-%d %H:%M:%S', time.localtime(time.time())))
step = 0 step = 0
for epoch in xrange(args.num_scan_data): for epoch in six.moves.xrange(args.num_scan_data):
shuffle_train = reader.unison_shuffle(train_data) shuffle_train = reader.unison_shuffle(train_data)
train_batches = reader.build_batches(shuffle_train, data_conf) train_batches = reader.build_batches(shuffle_train, data_conf)
ave_cost = 0.0 ave_cost = 0.0
for it in xrange(batch_num // dev_count): for it in six.moves.xrange(batch_num // dev_count):
feed_list = [] feed_list = []
for dev in xrange(dev_count): for dev in six.moves.xrange(dev_count):
index = it * dev_count + dev index = it * dev_count + dev
feed_dict = reader.make_one_batch_input(train_batches, index) feed_dict = reader.make_one_batch_input(train_batches, index)
feed_list.append(feed_dict) feed_list.append(feed_dict)
...@@ -215,9 +230,9 @@ def train(args): ...@@ -215,9 +230,9 @@ def train(args):
score_path = os.path.join(args.save_path, 'score.' + str(step)) score_path = os.path.join(args.save_path, 'score.' + str(step))
score_file = open(score_path, 'w') score_file = open(score_path, 'w')
for it in xrange(val_batch_num // dev_count): for it in six.moves.xrange(val_batch_num // dev_count):
feed_list = [] feed_list = []
for dev in xrange(dev_count): for dev in six.moves.xrange(dev_count):
val_index = it * dev_count + dev val_index = it * dev_count + dev
feed_dict = reader.make_one_batch_input(val_batches, feed_dict = reader.make_one_batch_input(val_batches,
val_index) val_index)
...@@ -227,9 +242,9 @@ def train(args): ...@@ -227,9 +242,9 @@ def train(args):
fetch_list=[logits.name]) fetch_list=[logits.name])
scores = np.array(predicts[0]) scores = np.array(predicts[0])
for dev in xrange(dev_count): for dev in six.moves.xrange(dev_count):
val_index = it * dev_count + dev val_index = it * dev_count + dev
for i in xrange(args.batch_size): for i in six.moves.xrange(args.batch_size):
score_file.write( score_file.write(
str(scores[args.batch_size * dev + i][0]) + '\t' str(scores[args.batch_size * dev + i][0]) + '\t'
+ str(val_batches["label"][val_index][ + str(val_batches["label"][val_index][
......
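The `encoding="bytes"` and `six.b('y')` changes in this file go together: when Python 3 unpickles a Python 2 `str` with `encoding="bytes"`, dictionary keys come back as `bytes`, so a lookup with the text key `'y'` no longer matches. The sketch below shows this with the standard library only and runs on Python 3. The byte string is a hand-written protocol-2 pickle standing in for a file written by Python 2; it is an assumption for illustration, not the real dataset.

```python
import pickle

# Hand-written protocol-2 pickle of the Python 2 dict {'y': [1, 0]}.
# The key is stored as a Python 2 byte string (SHORT_BINSTRING opcode).
PY2_PICKLE = b"\x80\x02}q\x00U\x01yq\x01]q\x02(K\x01K\x00es."

# With encoding="bytes", Python 3 keeps Python 2 str objects as bytes.
data = pickle.loads(PY2_PICKLE, encoding="bytes")

labels = data[b"y"]          # works: the key is bytes
has_text_key = "y" in data   # the text key 'y' is not present
```

`six.b('y')` evaluates to `b'y'` on Python 3 and to the plain `'y'` string on Python 2, which is why the diff can use one spelling for both interpreters.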
import sys import sys
import six
import numpy as np import numpy as np
from sklearn.metrics import average_precision_score from sklearn.metrics import average_precision_score
...@@ -7,7 +8,7 @@ def mean_average_precision(sort_data): ...@@ -7,7 +8,7 @@ def mean_average_precision(sort_data):
#to do #to do
count_1 = 0 count_1 = 0
sum_precision = 0 sum_precision = 0
for index in range(len(sort_data)): for index in six.moves.xrange(len(sort_data)):
if sort_data[index][1] == 1: if sort_data[index][1] == 1:
count_1 += 1 count_1 += 1
sum_precision += 1.0 * count_1 / (index + 1) sum_precision += 1.0 * count_1 / (index + 1)
......
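The loop in `mean_average_precision` accumulates the precision at each position where a relevant item appears. A self-contained sketch of that computation follows. The final division by the number of relevant items and the fallback for lists with no relevant item are assumptions, since the diff does not show the rest of the function; `average_precision` is a name chosen for this example.

```python
def average_precision(sort_data):
    """Average precision over a ranked list of (score, label) pairs.

    `sort_data` is assumed to be sorted by score, best first, with
    label 1 marking a relevant item.
    """
    count_1 = 0
    sum_precision = 0.0
    for index, (_, label) in enumerate(sort_data):
        if label == 1:
            count_1 += 1
            # Precision at this rank: relevant items seen so far / rank.
            sum_precision += 1.0 * count_1 / (index + 1)
    # Assumed normalisation; returns 0.0 when nothing is relevant.
    return sum_precision / count_1 if count_1 else 0.0


ranked = [(0.9, 1), (0.8, 0), (0.7, 1), (0.2, 0)]
ap = average_precision(ranked)  # (1/1 + 2/3) / 2
```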
import sys import sys
import six
def get_p_at_n_in_m(data, n, m, ind): def get_p_at_n_in_m(data, n, m, ind):
...@@ -30,9 +31,9 @@ def evaluate(file_path): ...@@ -30,9 +31,9 @@ def evaluate(file_path):
p_at_2_in_10 = 0.0 p_at_2_in_10 = 0.0
p_at_5_in_10 = 0.0 p_at_5_in_10 = 0.0
length = len(data) / 10 length = len(data) // 10
for i in xrange(0, length): for i in six.moves.xrange(0, length):
ind = i * 10 ind = i * 10
assert data[ind][1] == 1 assert data[ind][1] == 1
......
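The switch from `/` to `//` matters because `/` always returns a float on Python 3, and `range` rejects floats. A small sketch of the difference, run under Python 3; the variable names and data are made up for the example.

```python
data = list(range(30))

true_div = len(data) / 10    # float result on Python 3
floor_div = len(data) // 10  # integer result on both Python 2 and 3

try:
    list(range(true_div))
    float_accepted = True
except TypeError:
    # range() requires an integer argument.
    float_accepted = False

# With floor division the value can be used directly as a loop bound.
groups = [data[i * 10:(i + 1) * 10] for i in range(floor_div)]
```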
import cPickle as pickle import six
import numpy as np import numpy as np
try:
import cPickle as pickle #python 2
except ImportError as e:
import pickle #python 3
def unison_shuffle(data, seed=None): def unison_shuffle(data, seed=None):
if seed is not None: if seed is not None:
np.random.seed(seed) np.random.seed(seed)
y = np.array(data['y']) y = np.array(data[six.b('y')])
c = np.array(data['c']) c = np.array(data[six.b('c')])
r = np.array(data['r']) r = np.array(data[six.b('r')])
assert len(y) == len(c) == len(r) assert len(y) == len(c) == len(r)
p = np.random.permutation(len(y)) p = np.random.permutation(len(y))
shuffle_data = {'y': y[p], 'c': c[p], 'r': r[p]} shuffle_data = {six.b('y'): y[p], six.b('c'): c[p], six.b('r'): r[p]}
return shuffle_data return shuffle_data
...@@ -65,9 +70,9 @@ def produce_one_sample(data, ...@@ -65,9 +70,9 @@ def produce_one_sample(data,
max_turn_len=50 max_turn_len=50
return y, nor_turns_nor_c, nor_r, turn_len, term_len, r_len return y, nor_turns_nor_c, nor_r, turn_len, term_len, r_len
''' '''
c = data['c'][index] c = data[six.b('c')][index]
r = data['r'][index][:] r = data[six.b('r')][index][:]
y = data['y'][index] y = data[six.b('y')][index]
turns = split_c(c, split_id) turns = split_c(c, split_id)
#normalize turns_c length, nor_turns length is max_turn_num #normalize turns_c length, nor_turns length is max_turn_num
...@@ -101,7 +106,7 @@ def build_one_batch(data, ...@@ -101,7 +106,7 @@ def build_one_batch(data,
_label = [] _label = []
for i in range(conf['batch_size']): for i in six.moves.xrange(conf['batch_size']):
index = batch_index * conf['batch_size'] + i index = batch_index * conf['batch_size'] + i
y, nor_turns_nor_c, nor_r, turn_len, term_len, r_len = produce_one_sample( y, nor_turns_nor_c, nor_r, turn_len, term_len, r_len = produce_one_sample(
data, index, conf['_EOS_'], conf['max_turn_num'], data, index, conf['_EOS_'], conf['max_turn_num'],
...@@ -145,8 +150,8 @@ def build_batches(data, conf, turn_cut_type='tail', term_cut_type='tail'): ...@@ -145,8 +150,8 @@ def build_batches(data, conf, turn_cut_type='tail', term_cut_type='tail'):
_label_batches = [] _label_batches = []
batch_len = len(data['y']) / conf['batch_size'] batch_len = len(data[six.b('y')]) // conf['batch_size']
for batch_index in range(batch_len): for batch_index in six.moves.range(batch_len):
_turns, _tt_turns_len, _every_turn_len, _response, _response_len, _label = build_one_batch( _turns, _tt_turns_len, _every_turn_len, _response, _response_len, _label = build_one_batch(
data, batch_index, conf, turn_cut_type='tail', term_cut_type='tail') data, batch_index, conf, turn_cut_type='tail', term_cut_type='tail')
...@@ -192,8 +197,10 @@ def make_one_batch_input(data_batches, index): ...@@ -192,8 +197,10 @@ def make_one_batch_input(data_batches, index):
max_turn_num = turns.shape[1] max_turn_num = turns.shape[1]
max_turn_len = turns.shape[2] max_turn_len = turns.shape[2]
turns_list = [turns[:, i, :] for i in xrange(max_turn_num)] turns_list = [turns[:, i, :] for i in six.moves.xrange(max_turn_num)]
every_turn_len_list = [every_turn_len[:, i] for i in xrange(max_turn_num)] every_turn_len_list = [
every_turn_len[:, i] for i in six.moves.xrange(max_turn_num)
]
feed_dict = {} feed_dict = {}
for i, turn in enumerate(turns_list): for i, turn in enumerate(turns_list):
...@@ -204,7 +211,7 @@ def make_one_batch_input(data_batches, index): ...@@ -204,7 +211,7 @@ def make_one_batch_input(data_batches, index):
for i, turn_len in enumerate(every_turn_len_list): for i, turn_len in enumerate(every_turn_len_list):
feed_dict["turn_mask_%d" % i] = np.ones( feed_dict["turn_mask_%d" % i] = np.ones(
(batch_size, max_turn_len, 1)).astype("float32") (batch_size, max_turn_len, 1)).astype("float32")
for row in xrange(batch_size): for row in six.moves.xrange(batch_size):
feed_dict["turn_mask_%d" % i][row, turn_len[row]:, 0] = 0 feed_dict["turn_mask_%d" % i][row, turn_len[row]:, 0] = 0
feed_dict["response"] = response feed_dict["response"] = response
...@@ -212,7 +219,7 @@ def make_one_batch_input(data_batches, index): ...@@ -212,7 +219,7 @@ def make_one_batch_input(data_batches, index):
feed_dict["response_mask"] = np.ones( feed_dict["response_mask"] = np.ones(
(batch_size, max_turn_len, 1)).astype("float32") (batch_size, max_turn_len, 1)).astype("float32")
for row in xrange(batch_size): for row in six.moves.xrange(batch_size):
feed_dict["response_mask"][row, response_len[row]:, 0] = 0 feed_dict["response_mask"][row, response_len[row]:, 0] = 0
feed_dict["label"] = np.array([data_batches["label"][index]]).reshape( feed_dict["label"] = np.array([data_batches["label"][index]]).reshape(
...@@ -228,14 +235,14 @@ if __name__ == '__main__': ...@@ -228,14 +235,14 @@ if __name__ == '__main__':
"max_turn_len": 50, "max_turn_len": 50,
"_EOS_": 28270, "_EOS_": 28270,
} }
train, val, test = pickle.load(open('../data/ubuntu/data_small.pkl', 'rb')) with open('../ubuntu/data/data_small.pkl', 'rb') as f:
if six.PY2:
train, val, test = pickle.load(f)
else:
train, val, test = pickle.load(f, encoding="bytes")
print('load data success') print('load data success')
train_batches = build_batches(train, conf) train_batches = build_batches(train, conf)
val_batches = build_batches(val, conf) val_batches = build_batches(val, conf)
test_batches = build_batches(test, conf) test_batches = build_batches(test, conf)
print('build batches success') print('build batches success')
pickle.dump([train_batches, val_batches, test_batches],
open('../data/ubuntu/data_small_xxx.pkl', 'wb'))
print('dump success')
import six
import os
def print_arguments(args): def print_arguments(args):
print('----------- Configuration Arguments -----------') print('----------- Configuration Arguments -----------')
for arg, value in sorted(vars(args).iteritems()): for arg, value in sorted(six.iteritems(vars(args))):
print('%s: %s' % (arg, value)) print('%s: %s' % (arg, value))
print('------------------------------------------------') print('------------------------------------------------')
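def mkdir(path):
    # Create `path` together with any missing parent directories.
    # The recursive version never terminated for a relative path whose
    # top-level component did not exist, because os.path.split("")[0] is "".
    if path and not os.path.isdir(path):
        os.makedirs(path)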
def pos_encoding_init(): def pos_encoding_init():
pass pass
......
deeplabv3plus_xception65_initialize.params
deeplabv3plus.params
deeplabv3plus.tar.gz
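DeepLab: running the example programs in this directory requires the latest PaddlePaddle develop version. If your installed PaddlePaddle version is older than this, update it by following the [installation guide](http://www.paddlepaddle.org/docs/develop/documentation/zh/build_and_install/pip_install_cn.html). DeepLab: running the example programs in this directory requires PaddlePaddle Fluid v1.0.0 or later. If your installed PaddlePaddle version is older than this, update it by following the installation guide. When running on a GPU, the program requires cuDNN v7.
## Code structure ## Code structure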
...@@ -41,10 +41,12 @@ data/cityscape/ ...@@ -41,10 +41,12 @@ data/cityscape/
To train the model from scratch, download our initialization model To train the model from scratch, download our initialization model
``` ```
wget http://paddlemodels.cdn.bcebos.com/deeplab/deeplabv3plus_xception65_initialize.tar.gz wget http://paddlemodels.cdn.bcebos.com/deeplab/deeplabv3plus_xception65_initialize.tar.gz
tar -xf deeplabv3plus_xception65_initialize.tar.gz && rm deeplabv3plus_xception65_initialize.tar.gz
``` ```
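To fine-tune from the final trained model or use it directly for prediction, download our final model To fine-tune from the final trained model or use it directly for prediction, download our final model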
``` ```
wget http://paddlemodels.cdn.bcebos.com/deeplab/deeplabv3plus.tar.gz wget http://paddlemodels.cdn.bcebos.com/deeplab/deeplabv3plus.tar.gz
tar -xf deeplabv3plus.tar.gz && rm deeplabv3plus.tar.gz
``` ```
...@@ -70,11 +72,11 @@ python train.py --help ...@@ -70,11 +72,11 @@ python train.py --help
``` ```
python ./train.py \ python ./train.py \
--batch_size=8 \ --batch_size=8 \
--parallel=true --parallel=true \
--train_crop_size=769 \ --train_crop_size=769 \
--total_step=90000 \ --total_step=90000 \
--init_weights_path=$INIT_WEIGHTS_PATH \ --init_weights_path=deeplabv3plus_xception65_initialize.params \
--save_weights_path=$SAVE_WEIGHTS_PATH \ --save_weights_path=output \
--dataset_path=$DATASET_PATH --dataset_path=$DATASET_PATH
``` ```
...@@ -82,11 +84,10 @@ python ./train.py \ ...@@ -82,11 +84,10 @@ python ./train.py \
Run the following command to evaluate on the `Cityscape` test dataset: Run the following command to evaluate on the `Cityscape` test dataset:
``` ```
python ./eval.py \ python ./eval.py \
--init_weights_path=$INIT_WEIGHTS_PATH \ --init_weights=deeplabv3plus.params \
--dataset_path=$DATASET_PATH --dataset_path=$DATASET_PATH
``` ```
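The model file must be specified with the `--model_path` option. The model file must be specified with the `--model_path` option. The evaluation metric reported by the test script is mean IoU.
The evaluation metric reported by the test script is [mean IoU]().
## Results ## Results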
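As a rough illustration of the metric, the sketch below averages `right / (right + wrong)` over the classes that actually occur, which mirrors the `miou2` expression in the evaluation loop of this commit. The helper name `mean_iou` and the sample counts are assumptions made for the example and are not part of the repository.

```python
import numpy as np


def mean_iou(right, wrong):
    """Average right / (right + wrong) over classes with at least one pixel.

    `right` and `wrong` are per-class pixel counts. The names follow the
    evaluation loop, but this helper itself is hypothetical.
    """
    right = np.asarray(right, dtype="float64")
    wrong = np.asarray(wrong, dtype="float64")
    present = (right + wrong) != 0  # skip classes that never occur
    return float(np.mean(right[present] / (right[present] + wrong[present])))


# Three classes; the last one never occurs and is left out of the average.
print(mean_iou([8, 1, 0], [2, 3, 0]))
```

Masking out empty classes avoids a division by zero and keeps absent classes from pulling the average down.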
......
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import os import os
os.environ['FLAGS_fraction_of_gpu_memory_to_use'] = '0.98' os.environ['FLAGS_fraction_of_gpu_memory_to_use'] = '0.98'
...@@ -91,7 +94,7 @@ exe = fluid.Executor(place) ...@@ -91,7 +94,7 @@ exe = fluid.Executor(place)
exe.run(sp) exe.run(sp)
if args.init_weights_path: if args.init_weights_path:
print "load from:", args.init_weights_path print("load from:", args.init_weights_path)
load_model() load_model()
dataset = CityscapeDataset(args.dataset_path, 'val') dataset = CityscapeDataset(args.dataset_path, 'val')
...@@ -118,7 +121,7 @@ for i, imgs, labels, names in batches: ...@@ -118,7 +121,7 @@ for i, imgs, labels, names in batches:
mp = (wrong + right) != 0 mp = (wrong + right) != 0
miou2 = np.mean((right[mp] * 1.0 / (right[mp] + wrong[mp]))) miou2 = np.mean((right[mp] * 1.0 / (right[mp] + wrong[mp])))
if args.verbose: if args.verbose:
print 'step: %s, mIoU: %s' % (i + 1, miou2) print('step: %s, mIoU: %s' % (i + 1, miou2))
else: else:
print '\rstep: %s, mIoU: %s' % (i + 1, miou2), print('\rstep: %s, mIoU: %s' % (i + 1, miou2))
sys.stdout.flush() sys.stdout.flush()
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import paddle import paddle
import paddle.fluid as fluid import paddle.fluid as fluid
...@@ -50,7 +53,7 @@ def append_op_result(result, name): ...@@ -50,7 +53,7 @@ def append_op_result(result, name):
def conv(*args, **kargs): def conv(*args, **kargs):
kargs['param_attr'] = name_scope + 'weights' kargs['param_attr'] = name_scope + 'weights'
if kargs.has_key('bias_attr') and kargs['bias_attr']: if 'bias_attr' in kargs and kargs['bias_attr']:
kargs['bias_attr'] = name_scope + 'biases' kargs['bias_attr'] = name_scope + 'biases'
else: else:
kargs['bias_attr'] = False kargs['bias_attr'] = False
...@@ -62,7 +65,7 @@ def group_norm(input, G, eps=1e-5, param_attr=None, bias_attr=None): ...@@ -62,7 +65,7 @@ def group_norm(input, G, eps=1e-5, param_attr=None, bias_attr=None):
N, C, H, W = input.shape N, C, H, W = input.shape
if C % G != 0: if C % G != 0:
print "group can not divide channle:", C, G print("group can not divide channle:", C, G)
for d in range(10): for d in range(10):
for t in [d, -d]: for t in [d, -d]:
if G + t <= 0: continue if G + t <= 0: continue
...@@ -70,7 +73,7 @@ def group_norm(input, G, eps=1e-5, param_attr=None, bias_attr=None): ...@@ -70,7 +73,7 @@ def group_norm(input, G, eps=1e-5, param_attr=None, bias_attr=None):
G = G + t G = G + t
break break
if C % G == 0: if C % G == 0:
print "use group size:", G print("use group size:", G)
break break
assert C % G == 0 assert C % G == 0
param_shape = (G, ) param_shape = (G, )
...@@ -139,7 +142,7 @@ def seq_conv(input, channel, stride, filter, dilation=1, act=None): ...@@ -139,7 +142,7 @@ def seq_conv(input, channel, stride, filter, dilation=1, act=None):
filter, filter,
stride, stride,
groups=input.shape[1], groups=input.shape[1],
padding=(filter / 2) * dilation, padding=(filter // 2) * dilation,
dilation=dilation) dilation=dilation)
input = bn(input) input = bn(input)
if act: input = act(input) if act: input = act(input)
......
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import cv2 import cv2
import numpy as np import numpy as np
import os
import six
default_config = { default_config = {
"shuffle": True, "shuffle": True,
...@@ -30,7 +35,7 @@ def slice_with_pad(a, s, value=0): ...@@ -30,7 +35,7 @@ def slice_with_pad(a, s, value=0):
pr = 0 pr = 0
pads.append([pl, pr]) pads.append([pl, pr])
slices.append([l, r]) slices.append([l, r])
slices = map(lambda x: slice(x[0], x[1], 1), slices) slices = list(map(lambda x: slice(x[0], x[1], 1), slices))
a = a[slices] a = a[slices]
a = np.pad(a, pad_width=pads, mode='constant', constant_values=value) a = np.pad(a, pad_width=pads, mode='constant', constant_values=value)
return a return a
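Only the tail of `slice_with_pad` is visible in this hunk, so the following is a minimal sketch of the general technique rather than the repository's function: crop an array to a requested range per axis and pad whatever falls outside the array. The name `crop_with_pad` and its `ranges` argument are hypothetical, and the sketch assumes that `ranges` covers every axis and that each range overlaps the array.

```python
import numpy as np


def crop_with_pad(a, ranges, value=0):
    """Crop `a` to the (start, stop) range given for each axis, filling
    positions that fall outside the array with `value`."""
    slices, pads = [], []
    for (start, stop), size in zip(ranges, a.shape):
        pad_left = max(0, -start)        # cells requested before index 0
        pad_right = max(0, stop - size)  # cells requested past the last index
        slices.append(slice(max(start, 0), min(stop, size), 1))
        pads.append((pad_left, pad_right))
    # Index with a tuple: newer NumPy releases no longer accept a plain
    # list of slices for multidimensional indexing.
    a = a[tuple(slices)]
    return np.pad(a, pad_width=pads, mode="constant", constant_values=value)


image = np.arange(16).reshape(4, 4)
patch = crop_with_pad(image, [(-1, 2), (2, 5)], value=255)
print(patch.shape)
```

The requested window here hangs over the top and right edges of `image`, so one padded row and one padded column are added around the 2x2 region that does exist.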
...@@ -38,11 +43,17 @@ def slice_with_pad(a, s, value=0): ...@@ -38,11 +43,17 @@ def slice_with_pad(a, s, value=0):
class CityscapeDataset: class CityscapeDataset:
def __init__(self, dataset_dir, subset='train', config=default_config): def __init__(self, dataset_dir, subset='train', config=default_config):
label_dirname = os.path.join(dataset_dir, 'gtFine/' + subset)
if six.PY2:
import commands import commands
label_dirname = dataset_dir + 'gtFine/' + subset
label_files = commands.getoutput( label_files = commands.getoutput(
"find %s -type f | grep labelTrainIds | sort" % "find %s -type f | grep labelTrainIds | sort" %
label_dirname).splitlines() label_dirname).splitlines()
else:
import subprocess
label_files = subprocess.getstatusoutput(
"find %s -type f | grep labelTrainIds | sort" %
label_dirname)[-1].splitlines()
self.label_files = label_files self.label_files = label_files
self.label_dirname = label_dirname self.label_dirname = label_dirname
self.index = 0 self.index = 0
...@@ -50,7 +61,7 @@ class CityscapeDataset: ...@@ -50,7 +61,7 @@ class CityscapeDataset:
self.dataset_dir = dataset_dir self.dataset_dir = dataset_dir
self.config = config self.config = config
self.reset() self.reset()
print "total number", len(label_files) print("total number", len(label_files))
def reset(self, shuffle=False): def reset(self, shuffle=False):
self.index = 0 self.index = 0
...@@ -66,13 +77,14 @@ class CityscapeDataset: ...@@ -66,13 +77,14 @@ class CityscapeDataset:
shape = self.config["crop_size"] shape = self.config["crop_size"]
while True: while True:
ln = self.label_files[self.index] ln = self.label_files[self.index]
img_name = self.dataset_dir + 'leftImg8bit/' + self.subset + ln[len( img_name = os.path.join(
self.label_dirname):] self.dataset_dir,
'leftImg8bit/' + self.subset + ln[len(self.label_dirname):])
img_name = img_name.replace('gtFine_labelTrainIds', 'leftImg8bit') img_name = img_name.replace('gtFine_labelTrainIds', 'leftImg8bit')
label = cv2.imread(ln) label = cv2.imread(ln)
img = cv2.imread(img_name) img = cv2.imread(img_name)
if img is None: if img is None:
print "load img failed:", img_name print("load img failed:", img_name)
self.next_img() self.next_img()
else: else:
break break
...@@ -128,5 +140,7 @@ class CityscapeDataset: ...@@ -128,5 +140,7 @@ class CityscapeDataset:
from prefetch_generator import BackgroundGenerator from prefetch_generator import BackgroundGenerator
batches = BackgroundGenerator(batches, 100) batches = BackgroundGenerator(batches, 100)
except: except:
print "You can install 'prefetch_generator' for acceleration of data reading." print(
"You can install 'prefetch_generator' for acceleration of data reading."
)
return batches return batches
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import os import os
os.environ['FLAGS_fraction_of_gpu_memory_to_use'] = '0.98' os.environ['FLAGS_fraction_of_gpu_memory_to_use'] = '0.98'
...@@ -126,13 +129,12 @@ exe = fluid.Executor(place) ...@@ -126,13 +129,12 @@ exe = fluid.Executor(place)
exe.run(sp) exe.run(sp)
if args.init_weights_path: if args.init_weights_path:
print "load from:", args.init_weights_path print("load from:", args.init_weights_path)
load_model() load_model()
dataset = CityscapeDataset(args.dataset_path, 'train') dataset = CityscapeDataset(args.dataset_path, 'train')
if args.parallel: if args.parallel:
print "Using ParallelExecutor."
exe_p = fluid.ParallelExecutor( exe_p = fluid.ParallelExecutor(
use_cuda=True, loss_name=loss_mean.name, main_program=tp) use_cuda=True, loss_name=loss_mean.name, main_program=tp)
...@@ -149,9 +151,9 @@ for i, imgs, labels, names in batches: ...@@ -149,9 +151,9 @@ for i, imgs, labels, names in batches:
'label': labels}, 'label': labels},
fetch_list=[pred, loss_mean]) fetch_list=[pred, loss_mean])
if i % 100 == 0: if i % 100 == 0:
print "Model is saved to", args.save_weights_path print("Model is saved to", args.save_weights_path)
save_model() save_model()
print "step %s, loss: %s" % (i, np.mean(retv[1])) print("step %s, loss: %s" % (i, np.mean(retv[1])))
print "Training done. Model is saved to", args.save_weights_path print("Training done. Model is saved to", args.save_weights_path)
save_model() save_model()
...@@ -10,3 +10,4 @@ output* ...@@ -10,3 +10,4 @@ output*
pred pred
eval_tools eval_tools
box* box*
PyramidBox_WiderFace*
...@@ -9,6 +9,7 @@ import time ...@@ -9,6 +9,7 @@ import time
import numpy as np import numpy as np
import threading import threading
import multiprocessing import multiprocessing
import traceback
try: try:
import queue import queue
except ImportError: except ImportError:
...@@ -71,6 +72,7 @@ class GeneratorEnqueuer(object): ...@@ -71,6 +72,7 @@ class GeneratorEnqueuer(object):
try: try:
task() task()
except Exception: except Exception:
traceback.print_exc()
self._stop_event.set() self._stop_event.set()
break break
else: else:
...@@ -78,6 +80,7 @@ class GeneratorEnqueuer(object): ...@@ -78,6 +80,7 @@ class GeneratorEnqueuer(object):
try: try:
task() task()
except Exception: except Exception:
traceback.print_exc()
self._stop_event.set() self._stop_event.set()
break break
......
...@@ -427,6 +427,7 @@ class PyramidBox(object): ...@@ -427,6 +427,7 @@ class PyramidBox(object):
overlap_threshold=0.35, overlap_threshold=0.35,
neg_overlap=0.35) neg_overlap=0.35)
loss = fluid.layers.reduce_sum(loss) loss = fluid.layers.reduce_sum(loss)
loss.persistable = True
return loss return loss
def train(self): def train(self):
......
...@@ -250,6 +250,10 @@ def train_generator(settings, file_list, batch_size, shuffle=True): ...@@ -250,6 +250,10 @@ def train_generator(settings, file_list, batch_size, shuffle=True):
ymin = float(temp_info_box[1]) ymin = float(temp_info_box[1])
w = float(temp_info_box[2]) w = float(temp_info_box[2])
h = float(temp_info_box[3]) h = float(temp_info_box[3])
# Filter out wrong labels
if w < 0 or h < 0:
continue
xmax = xmin + w xmax = xmin + w
ymax = ymin + h ymax = ymin + h
...@@ -294,7 +298,7 @@ def train(settings, ...@@ -294,7 +298,7 @@ def train(settings,
generator_output = enqueuer.queue.get() generator_output = enqueuer.queue.get()
break break
else: else:
time.sleep(0.02) time.sleep(0.01)
yield generator_output yield generator_output
generator_output = None generator_output = None
finally: finally:
......
...@@ -167,7 +167,7 @@ def train(args, config, train_params, train_file_list): ...@@ -167,7 +167,7 @@ def train(args, config, train_params, train_file_list):
shutil.rmtree(model_path) shutil.rmtree(model_path)
print('save models to %s' % (model_path)) print('save models to %s' % (model_path))
fluid.io.save_persistables(exe, model_path) fluid.io.save_persistables(exe, model_path, main_program=program)
train_py_reader.start() train_py_reader.start()
try: try:
...@@ -189,13 +189,13 @@ def train(args, config, train_params, train_file_list): ...@@ -189,13 +189,13 @@ def train(args, config, train_params, train_file_list):
fetch_vars = [np.mean(np.array(v)) for v in fetch_vars] fetch_vars = [np.mean(np.array(v)) for v in fetch_vars]
if batch_id % 10 == 0: if batch_id % 10 == 0:
if not args.use_pyramidbox: if not args.use_pyramidbox:
print("Pass {0}, batch {1}, loss {2}, time {3}".format( print("Pass {:d}, batch {:d}, loss {:.6f}, time {:.5f}".format(
pass_id, batch_id, fetch_vars[0], pass_id, batch_id, fetch_vars[0],
start_time - prev_start_time)) start_time - prev_start_time))
else: else:
print("Pass {0}, batch {1}, face loss {2}, " \ print("Pass {:d}, batch {:d}, face loss {:.6f}, " \
"head loss {3}, " \ "head loss {:.6f}, " \
"time {4}".format(pass_id, "time {:.5f}".format(pass_id,
batch_id, fetch_vars[0], fetch_vars[1], batch_id, fetch_vars[0], fetch_vars[1],
start_time - prev_start_time)) start_time - prev_start_time))
if pass_id % 1 == 0 or pass_id == epoc_num - 1: if pass_id % 1 == 0 or pass_id == epoc_num - 1:
......
...@@ -82,9 +82,6 @@ def save_widerface_bboxes(image_path, bboxes_scores, output_dir): ...@@ -82,9 +82,6 @@ def save_widerface_bboxes(image_path, bboxes_scores, output_dir):
image_name = image_path.split('/')[-1] image_name = image_path.split('/')[-1]
image_class = image_path.split('/')[-2] image_class = image_path.split('/')[-2]
image_name = image_name.encode('utf-8')
image_class = image_class.encode('utf-8')
odir = os.path.join(output_dir, image_class) odir = os.path.join(output_dir, image_class)
if not os.path.exists(odir): if not os.path.exists(odir):
os.makedirs(odir) os.makedirs(odir)
......
...@@ -43,7 +43,7 @@ After data preparation, one can start the training step by: ...@@ -43,7 +43,7 @@ After data preparation, one can start the training step by:
python train.py \ python train.py \
--max_size=1333 \ --max_size=1333 \
--scales=800 \ --scales=[800] \
--batch_size=8 \ --batch_size=8 \
--model_save_dir=output/ --model_save_dir=output/
...@@ -57,6 +57,22 @@ After data preparation, one can start the training step by: ...@@ -57,6 +57,22 @@ After data preparation, one can start the training step by:
sh ./pretrained/download.sh sh ./pretrained/download.sh
Set `pretrained_model` to load a pre-trained model. The same parameter is also used to load a trained model when fine-tuning. Set `pretrained_model` to load a pre-trained model. The same parameter is also used to load a trained model when fine-tuning.
Make sure the pre-trained model is downloaded and loaded correctly; otherwise the loss may become NaN during training.
**Install the [cocoapi](https://github.com/cocodataset/cocoapi):**
[cocoapi](https://github.com/cocodataset/cocoapi) is required for training. Install it as follows:
# COCOAPI=/path/to/clone/cocoapi
git clone https://github.com/cocodataset/cocoapi.git $COCOAPI
cd $COCOAPI/PythonAPI
# if cython is not installed
pip install Cython
# Install into global site-packages
make install
# Alternatively, if you do not have permissions or prefer
# not to install the COCO API into global site-packages
python2 setup.py install --user
**data reader introduction:** **data reader introduction:**
...@@ -103,18 +119,7 @@ Finetuning is to finetune model weights in a specific task by loading pretrained ...@@ -103,18 +119,7 @@ Finetuning is to finetune model weights in a specific task by loading pretrained
## Evaluation ## Evaluation
Evaluation measures the performance of a trained model. This sample provides `eval_coco_map.py`, which uses the COCO-specific mAP metric defined by the [COCO committee](http://cocodataset.org/#detections-eval). To use `eval_coco_map.py`, [cocoapi](https://github.com/cocodataset/cocoapi) is needed. Install the cocoapi: Evaluation measures the performance of a trained model. This sample provides `eval_coco_map.py`, which uses the COCO-specific mAP metric defined by the [COCO committee](http://cocodataset.org/#detections-eval).
# COCOAPI=/path/to/clone/cocoapi
git clone https://github.com/cocodataset/cocoapi.git $COCOAPI
cd $COCOAPI/PythonAPI
# if cython is not installed
pip install Cython
# Install into global site-packages
make install
# Alternatively, if you do not have permissions or prefer
# not to install the COCO API into global site-packages
python2 setup.py install --user
`eval_coco_map.py` is the main executor for evaluation; the evaluation step can be started by: `eval_coco_map.py` is the main executor for evaluation; the evaluation step can be started by:
...@@ -136,7 +141,7 @@ Faster RCNN mAP ...@@ -136,7 +141,7 @@ Faster RCNN mAP
| Detectron | 8 | 180000 | 0.315 | | Detectron | 8 | 180000 | 0.315 |
| Fluid minibatch padding | 8 | 180000 | 0.314 | | Fluid minibatch padding | 8 | 180000 | 0.314 |
| Fluid all padding | 8 | 180000 | 0.308 | | Fluid all padding | 8 | 180000 | 0.308 |
| Fluid no padding |6 | 240000 | 0.317 | | Fluid no padding |8 | 180000 | 0.316 |
* Fluid all padding: each image is padded to 1333\*1333. * Fluid all padding: each image is padded to 1333\*1333.
* Fluid minibatch padding: images in one batch are padded to the same size. This is the same method that Detectron uses. * Fluid minibatch padding: images in one batch are padded to the same size. This is the same method that Detectron uses.
......
...@@ -42,7 +42,7 @@ Faster RCNN object detection model ...@@ -42,7 +42,7 @@ Faster RCNN object detection model
python train.py \ python train.py \
--max_size=1333 \ --max_size=1333 \
--scales=800 \ --scales=[800] \
--batch_size=8 \ --batch_size=8 \
--model_save_dir=output/ \ --model_save_dir=output/ \
--pretrained_model=${path_to_pretrain_model} --pretrained_model=${path_to_pretrain_model}
...@@ -57,6 +57,22 @@ Faster RCNN object detection model ...@@ -57,6 +57,22 @@ Faster RCNN object detection model
sh ./pretrained/download.sh sh ./pretrained/download.sh
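Set `pretrained_model` to load the pre-trained model. The same setting is also used to load a trained model when fine-tuning. Set `pretrained_model` to load the pre-trained model. The same setting is also used to load a trained model when fine-tuning.
Before training, make sure the pre-trained model is downloaded and loaded correctly; otherwise the loss may become NaN during training.
**Install [cocoapi](https://github.com/cocodataset/cocoapi):**
[cocoapi](https://github.com/cocodataset/cocoapi) must be installed before training: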
# COCOAPI=/path/to/clone/cocoapi
git clone https://github.com/cocodataset/cocoapi.git $COCOAPI
cd $COCOAPI/PythonAPI
# if cython is not installed
pip install Cython
# Install into global site-packages
make install
# Alternatively, if you do not have permissions or prefer
# not to install the COCO API into global site-packages
python2 setup.py install --user
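**Data reader:** The data reader is defined in reader.py. Every image is resized, preserving its aspect ratio, so that the shorter side equals `scales`; if the longer side then exceeds `max_size`, the image is resized again so that the longer side equals `max_size`. During training, images are flipped horizontally. Images within the same batch can be padded to the same size. **Data reader:** The data reader is defined in reader.py. Every image is resized, preserving its aspect ratio, so that the shorter side equals `scales`; if the longer side then exceeds `max_size`, the image is resized again so that the longer side equals `max_size`. During training, images are flipped horizontally. Images within the same batch can be padded to the same size.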
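The resizing rule used by the data reader can be sketched as a single scale factor: shrink or enlarge so the shorter side reaches the target, then cap the factor if the longer side would exceed the limit. The function name `compute_im_scale` is an assumption for illustration; the actual logic lives in reader.py and its helpers, which are not shown in this diff.

```python
def compute_im_scale(height, width, target_size, max_size):
    """Return the factor that scales the shorter side to `target_size`,
    reduced when needed so the longer side does not exceed `max_size`."""
    short_side = min(height, width)
    long_side = max(height, width)
    im_scale = float(target_size) / short_side
    if round(im_scale * long_side) > max_size:
        # The longer side would overshoot, so it decides the scale instead.
        im_scale = float(max_size) / long_side
    return im_scale


# A 480x640 image with scales=800 and max_size=1333: the shorter side wins.
print(compute_im_scale(480, 640, 800, 1333))
# A very wide 500x2000 image is limited by max_size instead.
print(compute_im_scale(500, 2000, 800, 1333))
```

Using one factor for both sides is what keeps the aspect ratio unchanged.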
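...@@ -87,18 +103,7 @@ Faster RCNN training loss ...@@ -87,18 +103,7 @@ Faster RCNN training loss
## Evaluation ## Evaluation
Evaluation measures the performance metrics of a trained model. This example uses the [official COCO evaluation](http://cocodataset.org/#detections-eval), which requires [cocoapi](https://github.com/cocodataset/cocoapi) to be installed first: Evaluation measures the performance metrics of a trained model. This example uses the [official COCO evaluation](http://cocodataset.org/#detections-eval).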
# COCOAPI=/path/to/clone/cocoapi
git clone https://github.com/cocodataset/cocoapi.git $COCOAPI
cd $COCOAPI/PythonAPI
# if cython is not installed
pip install Cython
# Install into global site-packages
make install
# Alternatively, if you do not have permissions or prefer
# not to install the COCO API into global site-packages
python2 setup.py install --user
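`eval_coco_map.py` is the main script of the evaluation module; an example invocation follows: `eval_coco_map.py` is the main script of the evaluation module; an example invocation follows: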
...@@ -120,7 +125,7 @@ Faster RCNN mAP ...@@ -120,7 +125,7 @@ Faster RCNN mAP
| Detectron | 8 | 180000 | 0.315 | | Detectron | 8 | 180000 | 0.315 |
| Fluid minibatch padding | 8 | 180000 | 0.314 | | Fluid minibatch padding | 8 | 180000 | 0.314 |
| Fluid all padding | 8 | 180000 | 0.308 | | Fluid all padding | 8 | 180000 | 0.308 |
| Fluid no padding |6 | 240000 | 0.317 | | Fluid no padding |8 | 180000 | 0.316 |
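* Fluid all padding: each image is padded to 1333\*1333. * Fluid all padding: each image is padded to 1333\*1333.
* Fluid minibatch padding: images in the same batch are padded to the same size. This matches how Detectron handles padding. * Fluid minibatch padding: images in the same batch are padded to the same size. This matches how Detectron handles padding.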
......
# Copyright (c) 2016 PaddlePaddle Authors. All Rights Reserved.
#Licensed under the Apache License, Version 2.0 (the "License");
#you may not use this file except in compliance with the License.
#You may obtain a copy of the License at
# http://www.apache.org/licenses/LICENSE-2.0
#Unless required by applicable law or agreed to in writing, software
#distributed under the License is distributed on an "AS IS" BASIS,
#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
#See the License for the specific language governing permissions and
#limitations under the License.
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
from __future__ import unicode_literals
from edict import AttrDict
import six
import numpy as np
_C = AttrDict()
cfg = _C
#
# Training options
#
_C.TRAIN = AttrDict()
# scales an image's shortest side
_C.TRAIN.scales = [800]
# max size of longest side
_C.TRAIN.max_size = 1333
# images per GPU in minibatch
_C.TRAIN.im_per_batch = 1
# roi minibatch size per image
_C.TRAIN.batch_size_per_im = 512
# target fraction of foreground roi minibatch
_C.TRAIN.fg_fractrion = 0.25
# overlap threshold for a foreground roi
_C.TRAIN.fg_thresh = 0.5
# overlap threshold for a background roi
_C.TRAIN.bg_thresh_hi = 0.5
_C.TRAIN.bg_thresh_lo = 0.0
# If False, only resize image and not pad, image shape is different between
# GPUs in one mini-batch. If True, image shape is the same in one mini-batch.
_C.TRAIN.padding_minibatch = False
# Snapshot period
_C.TRAIN.snapshot_iter = 10000
# number of RPN proposals to keep before NMS
_C.TRAIN.rpn_pre_nms_top_n = 12000
# number of RPN proposals to keep after NMS
_C.TRAIN.rpn_post_nms_top_n = 2000
# NMS threshold used on RPN proposals
_C.TRAIN.rpn_nms_thresh = 0.7
# min size in RPN proposals
_C.TRAIN.rpn_min_size = 0.0
# eta for adaptive NMS in RPN
_C.TRAIN.rpn_eta = 1.0
# number of RPN examples per image
_C.TRAIN.rpn_batch_size_per_im = 256
# remove anchors out of the image
_C.TRAIN.rpn_straddle_thresh = 0.
# target fraction of foreground examples per RPN minibatch
_C.TRAIN.rpn_fg_fraction = 0.5
# min overlap between anchor and gt box to be a positive example
_C.TRAIN.rpn_positive_overlap = 0.7
# max overlap between anchor and gt box to be a negative example
_C.TRAIN.rpn_negative_overlap = 0.3
# stopgrad at a specified stage
_C.TRAIN.freeze_at = 2
# min area of ground truth box
_C.TRAIN.gt_min_area = -1
#
# Inference options
#
_C.TEST = AttrDict()
# scales an image's shortest side
_C.TEST.scales = [800]
# max size of longest side
_C.TEST.max_size = 1333
# eta for adaptive NMS in RPN
_C.TEST.rpn_eta = 1.0
# min score threshold to infer
_C.TEST.score_thresh = 0.05
# overlap threshold used for NMS
_C.TEST.nms_thresh = 0.5
# number of RPN proposals to keep before NMS
_C.TEST.rpn_pre_nms_top_n = 6000
# number of RPN proposals to keep after NMS
_C.TEST.rpn_post_nms_top_n = 1000
# min size in RPN proposals
_C.TEST.rpn_min_size = 0.0
# max number of detections
_C.TEST.detectiions_per_im = 100
# NMS threshold used on RPN proposals
_C.TEST.rpn_nms_thresh = 0.7
#
# Model options
#
# weight for bbox regression targets
_C.bbox_reg_weights = [0.1, 0.1, 0.2, 0.2]
# RPN anchor sizes
_C.anchor_sizes = [32, 64, 128, 256, 512]
# RPN anchor ratio
_C.aspect_ratio = [0.5, 1, 2]
# variance of anchors
_C.variances = [1., 1., 1., 1.]
# stride of feature map
_C.rpn_stride = [16.0, 16.0]
# Use roi pool or roi align, 'RoIPool' or 'RoIAlign'
_C.roi_func = 'RoIAlign'
# sampling ratio for roi align
_C.sampling_ratio = 0
# pooled width and pooled height
_C.roi_resolution = 14
# spatial scale
_C.spatial_scale = 1. / 16.
#
# SOLVER options
#
# base learning rate from which the final learning rate is derived
_C.learning_rate = 0.01
# maximum number of iterations
_C.max_iter = 180000
# warm up to learning rate
_C.warm_up_iter = 500
_C.warm_up_factor = 1. / 3.
# lr steps_with_decay
_C.lr_steps = [120000, 160000]
_C.lr_gamma = 0.1
# L2 regularization hyperparameter
_C.weight_decay = 0.0001
# momentum with SGD
_C.momentum = 0.9
#
# ENV options
#
# support both CPU and GPU
_C.use_gpu = True
# Whether use parallel
_C.parallel = True
# Class number
_C.class_num = 81
# support pyreader
_C.use_pyreader = True
# pixel mean values
_C.pixel_means = [102.9801, 115.9465, 122.7717]
# clip box to prevent overflowing
_C.bbox_clip = np.log(1000. / 16.)
# dataset path
_C.train_file_list = 'annotations/instances_train2017.json'
_C.train_data_dir = 'train2017'
_C.val_file_list = 'annotations/instances_val2017.json'
_C.val_data_dir = 'val2017'
def merge_cfg_from_args(args, mode):
"""Merge config keys, values in args into the global config."""
if mode == 'train':
sub_d = _C.TRAIN
else:
sub_d = _C.TEST
for k, v in sorted(six.iteritems(vars(args))):
d = _C
try:
value = eval(v)
except:
value = v
if k in sub_d:
sub_d[k] = value
else:
d[k] = value
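`merge_cfg_from_args` turns string arguments such as `'[800]'` into Python values with `eval` and keeps the raw value when that fails. A minimal sketch of the same parsing step using `ast.literal_eval`, which only accepts literals and so cannot run arbitrary expressions, might look like this. `parse_value` is a hypothetical helper and not part of this commit.

```python
import ast


def parse_value(raw):
    """Turn a command-line string into a Python literal when possible.

    Hypothetical helper: falls back to the original value when `raw`
    is not a string or is not a valid literal (for example a path).
    """
    try:
        return ast.literal_eval(raw)
    except (ValueError, SyntaxError, TypeError):
        return raw


print(parse_value("[800]"))    # parsed into a list
print(parse_value("output/"))  # not a literal, stays a string
print(parse_value(1333))       # non-strings pass through unchanged
```

Catching specific exceptions instead of using a bare `except` also lets `KeyboardInterrupt` propagate normally.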
...@@ -27,21 +27,27 @@ from __future__ import unicode_literals ...@@ -27,21 +27,27 @@ from __future__ import unicode_literals
import cv2 import cv2
import numpy as np import numpy as np
from config import cfg
def get_image_blob(roidb, settings): def get_image_blob(roidb, mode):
"""Builds an input blob from the images in the roidb at the specified """Builds an input blob from the images in the roidb at the specified
scales. scales.
""" """
scale_ind = np.random.randint(0, high=len(settings.scales)) if mode == 'train':
scales = cfg.TRAIN.scales
scale_ind = np.random.randint(0, high=len(scales))
target_size = scales[scale_ind]
max_size = cfg.TRAIN.max_size
else:
target_size = cfg.TEST.scales[0]
max_size = cfg.TEST.max_size
im = cv2.imread(roidb['image']) im = cv2.imread(roidb['image'])
assert im is not None, \ assert im is not None, \
'Failed to read image \'{}\''.format(roidb['image']) 'Failed to read image \'{}\''.format(roidb['image'])
if roidb['flipped']: if roidb['flipped']:
im = im[:, ::-1, :] im = im[:, ::-1, :]
target_size = settings.scales[scale_ind] im, im_scale = prep_im_for_blob(im, cfg.pixel_means, target_size, max_size)
im, im_scale = prep_im_for_blob(im, settings.mean_value, target_size,
settings.max_size)
return im, im_scale return im, im_scale
......
# Copyright (c) 2018 PaddlePaddle Authors. All Rights Reserve.
#
#Licensed under the Apache License, Version 2.0 (the "License");
#you may not use this file except in compliance with the License.
#You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
#Unless required by applicable law or agreed to in writing, software
#distributed under the License is distributed on an "AS IS" BASIS,
#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
#See the License for the specific language governing permissions and
#limitations under the License.
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
from __future__ import unicode_literals
class AttrDict(dict):
def __init__(self, *args, **kwargs):
super(AttrDict, self).__init__(*args, **kwargs)
def __getattr__(self, name):
if name in self.__dict__:
return self.__dict__[name]
elif name in self:
return self[name]
else:
raise AttributeError(name)
def __setattr__(self, name, value):
if name in self.__dict__:
self.__dict__[name] = value
else:
self[name] = value
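To show how `AttrDict` is used by config.py, here is a short usage sketch. The class below is a condensed variant written for this example (it omits the `__dict__` checks of the version in the diff), so treat it as an illustration of the pattern rather than the file's exact contents.

```python
class AttrDict(dict):
    """Dictionary whose keys can also be read and written as attributes."""

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails.
        if name in self:
            return self[name]
        raise AttributeError(name)

    def __setattr__(self, name, value):
        # Store every attribute assignment as a dictionary entry.
        self[name] = value


cfg = AttrDict()
cfg.TRAIN = AttrDict()
cfg.TRAIN.max_size = 1333        # attribute-style write
print(cfg["TRAIN"]["max_size"])  # key-style read of the same entry
print("max_size" in cfg.TRAIN)   # membership test works as for any dict
```

Because the values live in the dictionary itself, checks such as `k in sub_d` in `merge_cfg_from_args` work on the same data that attribute access reads.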
...@@ -29,18 +29,20 @@ import models.resnet as resnet ...@@ -29,18 +29,20 @@ import models.resnet as resnet
import json import json
from pycocotools.coco import COCO from pycocotools.coco import COCO
from pycocotools.cocoeval import COCOeval, Params from pycocotools.cocoeval import COCOeval, Params
from config import cfg
def eval(cfg): def eval():
     if '2014' in cfg.dataset:
         test_list = 'annotations/instances_val2014.json'
     elif '2017' in cfg.dataset:
         test_list = 'annotations/instances_val2017.json'
-    image_shape = [3, cfg.max_size, cfg.max_size]
+    image_shape = [3, cfg.TEST.max_size, cfg.TEST.max_size]
     class_nums = cfg.class_num
-    batch_size = cfg.batch_size
+    devices = os.getenv("CUDA_VISIBLE_DEVICES") or ""
+    devices_num = len(devices.split(","))
+    total_batch_size = devices_num * cfg.TRAIN.im_per_batch
     cocoGt = COCO(os.path.join(cfg.data_dir, test_list))
     numId_to_catId_map = {i + 1: v for i, v in enumerate(cocoGt.getCatIds())}
     category_ids = cocoGt.getCatIds()
@@ -51,7 +53,6 @@ def eval(cfg):
     label_list[0] = ['background']
     model = model_builder.FasterRCNN(
-        cfg=cfg,
         add_conv_body_func=resnet.add_ResNet50_conv4_body,
         add_roi_box_head_func=resnet.add_ResNet_roi_conv5_head,
         use_pyreader=False,
@@ -66,7 +67,7 @@ def eval(cfg):
             return os.path.exists(os.path.join(cfg.pretrained_model, var.name))
         fluid.io.load_vars(exe, cfg.pretrained_model, predicate=if_exist)
     # yapf: enable
-    test_reader = reader.test(cfg, batch_size)
+    test_reader = reader.test(total_batch_size)
     feeder = fluid.DataFeeder(place=place, feed_list=model.feeds())
     dts_res = []
@@ -80,11 +81,11 @@ def eval(cfg):
             fetch_list=[v.name for v in fetch_list],
             feed=feeder.feed(batch_data),
             return_numpy=False)
-        new_lod, nmsed_out = get_nmsed_box(cfg, rpn_rois_v, confs_v, locs_v,
+        new_lod, nmsed_out = get_nmsed_box(rpn_rois_v, confs_v, locs_v,
                                            class_nums, im_info,
                                            numId_to_catId_map)
-        dts_res += get_dt_res(batch_size, new_lod, nmsed_out, batch_data)
+        dts_res += get_dt_res(total_batch_size, new_lod, nmsed_out, batch_data)
         end = time.time()
         print('batch id: {}, time: {}'.format(batch_id, end - start))
     with open("detection_result.json", 'w') as outfile:
@@ -100,6 +101,4 @@ def eval(cfg):
 if __name__ == '__main__':
     args = parse_args()
     print_arguments(args)
-    data_args = reader.Settings(args)
-    eval(data_args)
+    eval()
@@ -20,6 +20,7 @@ import box_utils
 from PIL import Image
 from PIL import ImageDraw
 from PIL import ImageFont
+from config import cfg
 def box_decoder(target_box, prior_box, prior_box_var):
@@ -31,10 +32,8 @@ def box_decoder(target_box, prior_box, prior_box_var):
     prior_box_loc[:, 3] = (prior_box[:, 3] + prior_box[:, 1]) / 2
     pred_bbox = np.zeros_like(target_box, dtype=np.float32)
     for i in range(prior_box.shape[0]):
-        dw = np.minimum(prior_box_var[2] * target_box[i, 2::4],
-                        np.log(1000. / 16.))
-        dh = np.minimum(prior_box_var[3] * target_box[i, 3::4],
-                        np.log(1000. / 16.))
+        dw = np.minimum(prior_box_var[2] * target_box[i, 2::4], cfg.bbox_clip)
+        dh = np.minimum(prior_box_var[3] * target_box[i, 3::4], cfg.bbox_clip)
         pred_bbox[i, 0::4] = prior_box_var[0] * target_box[
             i, 0::4] * prior_box_loc[i, 0] + prior_box_loc[i, 2]
         pred_bbox[i, 1::4] = prior_box_var[1] * target_box[
@@ -67,11 +66,11 @@ def clip_tiled_boxes(boxes, im_shape):
     return boxes
-def get_nmsed_box(args, rpn_rois, confs, locs, class_nums, im_info,
+def get_nmsed_box(rpn_rois, confs, locs, class_nums, im_info,
                   numId_to_catId_map):
     lod = rpn_rois.lod()[0]
     rpn_rois_v = np.array(rpn_rois)
-    variance_v = np.array([0.1, 0.1, 0.2, 0.2])
+    variance_v = np.array(cfg.bbox_reg_weights)
     confs_v = np.array(confs)
     locs_v = np.array(locs)
     rois = box_decoder(locs_v, rpn_rois_v, variance_v)
@@ -89,12 +88,12 @@ def get_nmsed_box(args, rpn_rois, confs, locs, class_nums, im_info,
         cls_boxes = [[] for _ in range(class_nums)]
         scores_n = confs_v[start:end, :]
         for j in range(1, class_nums):
-            inds = np.where(scores_n[:, j] > args.score_threshold)[0]
+            inds = np.where(scores_n[:, j] > cfg.TEST.score_thresh)[0]
             scores_j = scores_n[inds, j]
             rois_j = rois_n[inds, j * 4:(j + 1) * 4]
             dets_j = np.hstack((rois_j, scores_j[:, np.newaxis])).astype(
                 np.float32, copy=False)
-            keep = box_utils.nms(dets_j, args.nms_threshold)
+            keep = box_utils.nms(dets_j, cfg.TEST.nms_thresh)
             nms_dets = dets_j[keep, :]
             #add labels
             cat_id = numId_to_catId_map[j]
@@ -105,8 +104,8 @@ def get_nmsed_box(args, rpn_rois, confs, locs, class_nums, im_info,
         # Limit to max_per_image detections **over all classes**
         image_scores = np.hstack(
             [cls_boxes[j][:, -2] for j in range(1, class_nums)])
-        if len(image_scores) > 100:
-            image_thresh = np.sort(image_scores)[-100]
+        if len(image_scores) > cfg.TEST.detectiions_per_im:
+            image_thresh = np.sort(image_scores)[-cfg.TEST.detectiions_per_im]
             for j in range(1, class_nums):
                 keep = np.where(cls_boxes[j][:, -2] >= image_thresh)[0]
                 cls_boxes[j] = cls_boxes[j][keep, :]
......
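Review note: the last hunk of this file caps the number of detections kept per image across all classes, now driven by a config value instead of the literal 100. A minimal NumPy sketch of that cap follows. The function name `limit_detections` and the `score_col` parameter are hypothetical; the diff itself reads scores from column `-2` because a label column is appended to each row.

```python
import numpy as np


def limit_detections(cls_boxes, max_per_image, score_col=-1):
    """Keep roughly the `max_per_image` best detections over all classes.

    cls_boxes is a list with one (N_j, K) array per class. Index 0
    (background) is skipped, mirroring the loop in the diff.
    """
    scores = np.hstack([b[:, score_col] for b in cls_boxes[1:]])
    if len(scores) <= max_per_image:
        return cls_boxes
    # The score of the max_per_image-th best detection is the cut-off.
    thresh = np.sort(scores)[-max_per_image]
    out = [cls_boxes[0]]
    for boxes in cls_boxes[1:]:
        keep = np.where(boxes[:, score_col] >= thresh)[0]
        out.append(boxes[keep, :])
    return out


# Two foreground classes with two detections each, capped at two in total.
boxes = [
    np.zeros((0, 5), dtype=np.float32),
    np.array([[0, 0, 1, 1, 0.9], [0, 0, 1, 1, 0.2]], dtype=np.float32),
    np.array([[0, 0, 1, 1, 0.5], [0, 0, 1, 1, 0.7]], dtype=np.float32),
]
capped = limit_detections(boxes, max_per_image=2)
```

Because the comparison is `>=`, detections tied at the cut-off score are all kept, so the result can slightly exceed the cap.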
fluid/faster_rcnn/image/mAP.jpg (binary image replaced: 80.0 KB → 41.0 KB)
import os
import time
import numpy as np
from eval_helper import get_nmsed_box
from eval_helper import get_dt_res
from eval_helper import draw_bounding_box_on_image
import paddle
import paddle.fluid as fluid
import reader
from utility import print_arguments, parse_args
import models.model_builder as model_builder
import models.resnet as resnet
import json
from pycocotools.coco import COCO
from pycocotools.cocoeval import COCOeval, Params
from config import cfg


def infer():
    if '2014' in cfg.dataset:
        test_list = 'annotations/instances_val2014.json'
    elif '2017' in cfg.dataset:
        test_list = 'annotations/instances_val2017.json'

    cocoGt = COCO(os.path.join(cfg.data_dir, test_list))
    numId_to_catId_map = {i + 1: v for i, v in enumerate(cocoGt.getCatIds())}
    category_ids = cocoGt.getCatIds()
    label_list = {
        item['id']: item['name']
        for item in cocoGt.loadCats(category_ids)
    }
    label_list[0] = ['background']
    image_shape = [3, cfg.TEST.max_size, cfg.TEST.max_size]
    class_nums = cfg.class_num

    model = model_builder.FasterRCNN(
        add_conv_body_func=resnet.add_ResNet50_conv4_body,
        add_roi_box_head_func=resnet.add_ResNet_roi_conv5_head,
        use_pyreader=False,
        is_train=False)
    model.build_model(image_shape)
    rpn_rois, confs, locs = model.eval_out()
    place = fluid.CUDAPlace(0) if cfg.use_gpu else fluid.CPUPlace()
    exe = fluid.Executor(place)
    # yapf: disable
    if cfg.pretrained_model:
        def if_exist(var):
            return os.path.exists(os.path.join(cfg.pretrained_model, var.name))
        fluid.io.load_vars(exe, cfg.pretrained_model, predicate=if_exist)
    # yapf: enable
    infer_reader = reader.infer()
    feeder = fluid.DataFeeder(place=place, feed_list=model.feeds())

    dts_res = []
    fetch_list = [rpn_rois, confs, locs]
    data = next(infer_reader())
    im_info = [data[0][1]]
    rpn_rois_v, confs_v, locs_v = exe.run(
        fetch_list=[v.name for v in fetch_list],
        feed=feeder.feed(data),
        return_numpy=False)
    new_lod, nmsed_out = get_nmsed_box(rpn_rois_v, confs_v, locs_v, class_nums,
                                       im_info, numId_to_catId_map)
    path = os.path.join(cfg.image_path, cfg.image_name)
    draw_bounding_box_on_image(path, nmsed_out, cfg.draw_threshold, label_list)


if __name__ == '__main__':
    args = parse_args()
    print_arguments(args)
    infer()
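Review note: `infer()` builds `numId_to_catId_map` because the network predicts contiguous class indices (0 for background, 1..N for objects) while COCO category ids have gaps. A small sketch of that mapping without pycocotools follows. The helper name `build_category_maps` is hypothetical, and unlike the file above it stores the background label as a plain string rather than a one-element list.

```python
def build_category_maps(cat_ids, cat_names):
    """Map contiguous class indices (1..N) to dataset category ids."""
    num_id_to_cat_id = {i + 1: cid for i, cid in enumerate(cat_ids)}
    labels = dict(zip(cat_ids, cat_names))
    labels[0] = 'background'
    return num_id_to_cat_id, labels


# COCO skips some ids: there is no category 12 between 11 and 13.
cat_ids = [10, 11, 13]
cat_names = ['traffic light', 'fire hydrant', 'stop sign']
num_id_to_cat_id, labels = build_category_maps(cat_ids, cat_names)

# Class index 3 from the network corresponds to COCO category id 13.
predicted_class = 3
cat_id = num_id_to_cat_id[predicted_class]
name = labels[cat_id]
```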
@@ -17,11 +17,11 @@ from paddle.fluid.param_attr import ParamAttr
 from paddle.fluid.initializer import Constant
 from paddle.fluid.initializer import Normal
 from paddle.fluid.regularizer import L2Decay
+from config import cfg
 class FasterRCNN(object):
     def __init__(self,
-                 cfg=None,
                  add_conv_body_func=None,
                  add_roi_box_head_func=None,
                  is_train=True,
@@ -29,7 +29,6 @@ class FasterRCNN(object):
                  use_random=True):
         self.add_conv_body_func = add_conv_body_func
         self.add_roi_box_head_func = add_roi_box_head_func
-        self.cfg = cfg
         self.is_train = is_train
         self.use_pyreader = use_pyreader
         self.use_random = use_random
@@ -111,10 +110,10 @@ class FasterRCNN(object):
                 name="conv_rpn_b", learning_rate=2., regularizer=L2Decay(0.)))
         self.anchor, self.var = fluid.layers.anchor_generator(
             input=rpn_conv,
-            anchor_sizes=self.cfg.anchor_sizes,
-            aspect_ratios=self.cfg.aspect_ratios,
-            variance=self.cfg.variance,
-            stride=[16.0, 16.0])
+            anchor_sizes=cfg.anchor_sizes,
+            aspect_ratios=cfg.aspect_ratio,
+            variance=cfg.variances,
+            stride=cfg.rpn_stride)
         num_anchor = self.anchor.shape[2]
         # Proposal classification scores
         self.rpn_cls_score = fluid.layers.conv2d(
@@ -152,8 +151,12 @@ class FasterRCNN(object):
         rpn_cls_score_prob = fluid.layers.sigmoid(
             self.rpn_cls_score, name='rpn_cls_score_prob')
-        pre_nms_top_n = 12000 if self.is_train else 6000
-        post_nms_top_n = 2000 if self.is_train else 1000
+        param_obj = cfg.TRAIN if self.is_train else cfg.TEST
+        pre_nms_top_n = param_obj.rpn_pre_nms_top_n
+        post_nms_top_n = param_obj.rpn_post_nms_top_n
+        nms_thresh = param_obj.rpn_nms_thresh
+        min_size = param_obj.rpn_min_size
+        eta = param_obj.rpn_eta
         rpn_rois, rpn_roi_probs = fluid.layers.generate_proposals(
             scores=rpn_cls_score_prob,
             bbox_deltas=self.rpn_bbox_pred,
@@ -162,9 +165,9 @@ class FasterRCNN(object):
             variances=self.var,
             pre_nms_top_n=pre_nms_top_n,
             post_nms_top_n=post_nms_top_n,
-            nms_thresh=0.7,
-            min_size=0.0,
-            eta=1.0)
+            nms_thresh=nms_thresh,
+            min_size=min_size,
+            eta=eta)
         self.rpn_rois = rpn_rois
         if self.is_train:
             outs = fluid.layers.generate_proposal_labels(
@@ -173,13 +176,13 @@ class FasterRCNN(object):
                 is_crowd=self.is_crowd,
                 gt_boxes=self.gt_box,
                 im_info=self.im_info,
-                batch_size_per_im=self.cfg.batch_size_per_im,
-                fg_fraction=0.25,
-                fg_thresh=0.5,
-                bg_thresh_hi=0.5,
-                bg_thresh_lo=0.0,
-                bbox_reg_weights=[0.1, 0.1, 0.2, 0.2],
-                class_nums=self.cfg.class_num,
+                batch_size_per_im=cfg.TRAIN.batch_size_per_im,
+                fg_fraction=cfg.TRAIN.fg_fractrion,
+                fg_thresh=cfg.TRAIN.fg_thresh,
+                bg_thresh_hi=cfg.TRAIN.bg_thresh_hi,
+                bg_thresh_lo=cfg.TRAIN.bg_thresh_lo,
+                bbox_reg_weights=cfg.bbox_reg_weights,
+                class_nums=cfg.class_num,
                 use_random=self.use_random)
             self.rois = outs[0]
@@ -193,15 +196,24 @@ class FasterRCNN(object):
             pool_rois = self.rois
         else:
             pool_rois = self.rpn_rois
-        pool = fluid.layers.roi_pool(
-            input=roi_input,
-            rois=pool_rois,
-            pooled_height=14,
-            pooled_width=14,
-            spatial_scale=0.0625)
+        if cfg.roi_func == 'RoIPool':
+            pool = fluid.layers.roi_pool(
+                input=roi_input,
+                rois=pool_rois,
+                pooled_height=cfg.roi_resolution,
+                pooled_width=cfg.roi_resolution,
+                spatial_scale=cfg.spatial_scale)
+        elif cfg.roi_func == 'RoIAlign':
+            pool = fluid.layers.roi_align(
+                input=roi_input,
+                rois=pool_rois,
+                pooled_height=cfg.roi_resolution,
+                pooled_width=cfg.roi_resolution,
+                spatial_scale=cfg.spatial_scale,
+                sampling_ratio=cfg.sampling_ratio)
         rcnn_out = self.add_roi_box_head_func(pool)
         self.cls_score = fluid.layers.fc(input=rcnn_out,
-                                         size=self.cfg.class_num,
+                                         size=cfg.class_num,
                                          act=None,
                                          name='cls_score',
                                          param_attr=ParamAttr(
@@ -213,7 +225,7 @@ class FasterRCNN(object):
                                              learning_rate=2.,
                                              regularizer=L2Decay(0.)))
         self.bbox_pred = fluid.layers.fc(input=rcnn_out,
-                                         size=4 * self.cfg.class_num,
+                                         size=4 * cfg.class_num,
                                          act=None,
                                          name='bbox_pred',
                                          param_attr=ParamAttr(
@@ -257,8 +269,7 @@ class FasterRCNN(object):
             x=rpn_cls_score_reshape, shape=(0, -1, 1))
         rpn_bbox_pred_reshape = fluid.layers.reshape(
             x=rpn_bbox_pred_reshape, shape=(0, -1, 4))
-        score_pred, loc_pred, score_tgt, loc_tgt = \
+        score_pred, loc_pred, score_tgt, loc_tgt, bbox_weight = \
             fluid.layers.rpn_target_assign(
                 bbox_pred=rpn_bbox_pred_reshape,
                 cls_logits=rpn_cls_score_reshape,
@@ -267,11 +278,11 @@ class FasterRCNN(object):
                 gt_boxes=self.gt_box,
                 is_crowd=self.is_crowd,
                 im_info=self.im_info,
-                rpn_batch_size_per_im=256,
-                rpn_straddle_thresh=0.0,
-                rpn_fg_fraction=0.5,
-                rpn_positive_overlap=0.7,
-                rpn_negative_overlap=0.3,
+                rpn_batch_size_per_im=cfg.TRAIN.rpn_batch_size_per_im,
+                rpn_straddle_thresh=cfg.TRAIN.rpn_straddle_thresh,
+                rpn_fg_fraction=cfg.TRAIN.rpn_fg_fraction,
+                rpn_positive_overlap=cfg.TRAIN.rpn_positive_overlap,
+                rpn_negative_overlap=cfg.TRAIN.rpn_negative_overlap,
                 use_random=self.use_random)
         score_tgt = fluid.layers.cast(x=score_tgt, dtype='float32')
         rpn_cls_loss = fluid.layers.sigmoid_cross_entropy_with_logits(
@@ -279,7 +290,12 @@ class FasterRCNN(object):
         rpn_cls_loss = fluid.layers.reduce_mean(
             rpn_cls_loss, name='loss_rpn_cls')
-        rpn_reg_loss = fluid.layers.smooth_l1(x=loc_pred, y=loc_tgt, sigma=3.0)
+        rpn_reg_loss = fluid.layers.smooth_l1(
+            x=loc_pred,
+            y=loc_tgt,
+            sigma=3.0,
+            inside_weight=bbox_weight,
+            outside_weight=bbox_weight)
         rpn_reg_loss = fluid.layers.reduce_sum(
             rpn_reg_loss, name='loss_rpn_bbox')
         score_shape = fluid.layers.shape(score_tgt)
......
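Review note: the final hunk of this file passes `bbox_weight` as both `inside_weight` and `outside_weight` to the RPN regression loss. The NumPy sketch below shows the usual Fast R-CNN form of a weighted smooth L1 loss with `sigma`. It assumes `fluid.layers.smooth_l1` follows that form; the standalone function here is illustrative and is not Paddle's implementation.

```python
import numpy as np


def smooth_l1(x, y, sigma=3.0, inside_weight=None, outside_weight=None):
    """Row-wise smooth L1 loss with optional element-wise weights."""
    diff = x - y
    if inside_weight is not None:
        # Inside weights scale the residual before the loss is applied.
        diff = diff * inside_weight
    abs_diff = np.abs(diff)
    sigma2 = sigma ** 2
    # Quadratic near zero, linear beyond 1 / sigma^2.
    loss = np.where(abs_diff < 1.0 / sigma2,
                    0.5 * sigma2 * diff ** 2,
                    abs_diff - 0.5 / sigma2)
    if outside_weight is not None:
        # Outside weights scale the per-element loss afterwards.
        loss = loss * outside_weight
    return loss.sum(axis=1, keepdims=True)


x = np.array([[0.0, 0.05, 2.0, 1.0]])
y = np.zeros_like(x)
w = np.array([[1.0, 1.0, 1.0, 0.0]])  # the last coordinate is masked out
out = smooth_l1(x, y, sigma=3.0, inside_weight=w, outside_weight=w)
```

With a 0/1 weight mask, coordinates that belong to non-foreground anchors contribute nothing to the sum, which is the practical effect of passing the same mask on both sides.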
@@ -16,6 +16,7 @@ import paddle.fluid as fluid
 from paddle.fluid.param_attr import ParamAttr
 from paddle.fluid.initializer import Constant
 from paddle.fluid.regularizer import L2Decay
+from config import cfg
 def conv_bn_layer(input,
@@ -88,8 +89,7 @@ def conv_affine_layer(input,
         default_initializer=Constant(0.))
     bias.stop_gradient = True
-    elt_mul = fluid.layers.elementwise_mul(x=conv, y=scale, axis=1)
-    out = fluid.layers.elementwise_add(x=elt_mul, y=bias, axis=1)
+    out = fluid.layers.affine_channel(x=conv, scale=scale, bias=bias)
     if act == 'relu':
         out = fluid.layers.relu(x=out)
     return out
@@ -137,7 +137,7 @@ ResNet_cfg = {
 }
-def add_ResNet50_conv4_body(body_input, freeze_at=2):
+def add_ResNet50_conv4_body(body_input):
     stages, block_func = ResNet_cfg[50]
     stages = stages[0:3]
     conv1 = conv_affine_layer(
@@ -149,13 +149,13 @@ def add_ResNet50_conv4_body(body_input, freeze_at=2):
         pool_stride=2,
         pool_padding=1)
     res2 = layer_warp(block_func, pool1, 64, stages[0], 1, name="res2")
-    if freeze_at == 2:
+    if cfg.TRAIN.freeze_at == 2:
         res2.stop_gradient = True
     res3 = layer_warp(block_func, res2, 128, stages[1], 2, name="res3")
-    if freeze_at == 3:
+    if cfg.TRAIN.freeze_at == 3:
         res3.stop_gradient = True
     res4 = layer_warp(block_func, res3, 256, stages[2], 2, name="res4")
-    if freeze_at == 4:
+    if cfg.TRAIN.freeze_at == 4:
         res4.stop_gradient = True
     return res4
......
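Review note: `conv_affine_layer` swaps an `elementwise_mul` followed by an `elementwise_add` (both on `axis=1`) for a single `affine_channel` call. For an NCHW tensor the two forms should compute the same per-channel `scale * x + bias`. The NumPy sketch below checks that equivalence; the function names are illustrative stand-ins, not the Paddle operators.

```python
import numpy as np


def affine_channel(x, scale, bias):
    """Per-channel affine transform of an NCHW array in one expression."""
    return x * scale.reshape(1, -1, 1, 1) + bias.reshape(1, -1, 1, 1)


def mul_then_add(x, scale, bias):
    """The two-step form: broadcast multiply on axis 1, then broadcast add."""
    elt_mul = x * scale[None, :, None, None]
    return elt_mul + bias[None, :, None, None]


x = np.arange(24, dtype=np.float32).reshape(2, 3, 2, 2)
scale = np.array([1.0, 2.0, 3.0], dtype=np.float32)
bias = np.array([0.0, 1.0, -1.0], dtype=np.float32)
out = affine_channel(x, scale, bias)
same = np.allclose(out, mul_then_add(x, scale, bias))
```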
-DIR="$( cd "$(dirname "$0")" ; pwd -P )"
+DIR="$(dirname "$PWD -P")"
 cd "$DIR"
 # Download the data.
......
@@ -26,19 +26,18 @@ import paddle.fluid.profiler as profiler
 import models.model_builder as model_builder
 import models.resnet as resnet
 from learning_rate import exponential_with_warmup_decay
+from config import cfg
-def train(cfg):
-    batch_size = cfg.batch_size
+def train():
     learning_rate = cfg.learning_rate
-    image_shape = [3, cfg.max_size, cfg.max_size]
+    image_shape = [3, cfg.TRAIN.max_size, cfg.TRAIN.max_size]
     num_iterations = cfg.max_iter
     devices = os.getenv("CUDA_VISIBLE_DEVICES") or ""
     devices_num = len(devices.split(","))
+    total_batch_size = devices_num * cfg.TRAIN.im_per_batch
     model = model_builder.FasterRCNN(
-        cfg=cfg,
         add_conv_body_func=resnet.add_ResNet50_conv4_body,
         add_roi_box_head_func=resnet.add_ResNet_roi_conv5_head,
         use_pyreader=cfg.use_pyreader,
@@ -51,8 +50,10 @@ def train(cfg):
     rpn_reg_loss.persistable = True
     loss = loss_cls + loss_bbox + rpn_cls_loss + rpn_reg_loss
-    boundaries = [120000, 160000]
-    values = [learning_rate, learning_rate * 0.1, learning_rate * 0.01]
+    boundaries = cfg.lr_steps
+    gamma = cfg.lr_gamma
+    step_num = len(cfg.lr_steps)
+    values = [learning_rate * (gamma**i) for i in range(step_num + 1)]
     optimizer = fluid.optimizer.Momentum(
         learning_rate=exponential_with_warmup_decay(
@@ -82,22 +83,16 @@ def train(cfg):
     train_exe = fluid.ParallelExecutor(
         use_cuda=bool(cfg.use_gpu), loss_name=loss.name)
-    assert cfg.batch_size % devices_num == 0, \
-        "batch_size = %d, devices_num = %d" %(cfg.batch_size, devices_num)
-    batch_size_per_dev = cfg.batch_size / devices_num
     if cfg.use_pyreader:
         train_reader = reader.train(
-            cfg,
-            batch_size=batch_size_per_dev,
-            total_batch_size=cfg.batch_size,
-            padding_total=cfg.padding_minibatch,
+            batch_size=cfg.TRAIN.im_per_batch,
+            total_batch_size=total_batch_size,
+            padding_total=cfg.TRAIN.padding_minibatch,
             shuffle=False)
         py_reader = model.py_reader
         py_reader.decorate_paddle_reader(train_reader)
     else:
-        train_reader = reader.train(
-            cfg, batch_size=cfg.batch_size, shuffle=False)
+        train_reader = reader.train(batch_size=total_batch_size, shuffle=False)
         feeder = fluid.DataFeeder(place=place, feed_list=model.feeds())
     fetch_list = [loss, loss_cls, loss_bbox, rpn_cls_loss, rpn_reg_loss]
@@ -109,7 +104,7 @@ def train(cfg):
         for batch_id in range(iterations):
             start_time = time.time()
-            data = train_reader().next()
+            data = next(train_reader())
             end_time = time.time()
             reader_time.append(end_time - start_time)
             start_time = time.time()
@@ -163,8 +158,7 @@ def train(cfg):
     run_func(2)
     # profiling
     start = time.time()
-    use_profile = False
-    if use_profile:
+    if cfg.use_profile:
         with profiler.profiler('GPU', 'total', '/tmp/profile_file'):
             reader_time, run_time, total_images = run_func(num_iterations)
     else:
@@ -181,6 +175,4 @@ def train(cfg):
 if __name__ == '__main__':
     args = parse_args()
     print_arguments(args)
-    data_args = reader.Settings(args)
-    train(data_args)
+    train()
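Review note: this commit derives the global batch size from the number of visible GPUs times `cfg.TRAIN.im_per_batch`, replacing the old divisibility assert on `cfg.batch_size`. The sketch below isolates that computation. The function names are hypothetical; the splitting logic copies the diff.

```python
def count_devices(visible_devices):
    """Count comma-separated device ids the way the diff does."""
    return len((visible_devices or "").split(","))


def total_batch_size(visible_devices, im_per_batch):
    """Images per step across all devices."""
    return count_devices(visible_devices) * im_per_batch


four_gpus = total_batch_size("0,1,2,3", im_per_batch=1)
unset = count_devices(None)
empty = count_devices("")
```

One property worth knowing when reviewing: `"".split(",")` returns `['']`, so an unset or empty `CUDA_VISIBLE_DEVICES` is counted as one device rather than zero.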
@@ -26,58 +26,45 @@ from collections import deque
 from roidbs import JsonDataset
 import data_utils
+from config import cfg
-class Settings(object):
-    def __init__(self, args=None):
-        for arg, value in sorted(six.iteritems(vars(args))):
-            setattr(self, arg, value)
-        if 'coco2014' in args.dataset:
-            self.class_nums = 81
-            self.train_file_list = 'annotations/instances_train2014.json'
-            self.train_data_dir = 'train2014'
-            self.val_file_list = 'annotations/instances_val2014.json'
-            self.val_data_dir = 'val2014'
-        elif 'coco2017' in args.dataset:
-            self.class_nums = 81
-            self.train_file_list = 'annotations/instances_train2017.json'
-            self.train_data_dir = 'train2017'
-            self.val_file_list = 'annotations/instances_val2017.json'
-            self.val_data_dir = 'val2017'
-        else:
-            raise NotImplementedError('Dataset {} not supported'.format(
-                self.dataset))
-        self.mean_value = np.array(self.mean_value)[
-            np.newaxis, np.newaxis, :].astype('float32')
-def coco(settings,
-         mode,
+def coco(mode,
          batch_size=None,
          total_batch_size=None,
          padding_total=False,
          shuffle=False):
+    if 'coco2014' in cfg.dataset:
+        cfg.train_file_list = 'annotations/instances_train2014.json'
+        cfg.train_data_dir = 'train2014'
+        cfg.val_file_list = 'annotations/instances_val2014.json'
+        cfg.val_data_dir = 'val2014'
+    elif 'coco2017' in cfg.dataset:
+        cfg.train_file_list = 'annotations/instances_train2017.json'
+        cfg.train_data_dir = 'train2017'
+        cfg.val_file_list = 'annotations/instances_val2017.json'
+        cfg.val_data_dir = 'val2017'
+    else:
+        raise NotImplementedError('Dataset {} not supported'.format(
+            cfg.dataset))
+    cfg.mean_value = np.array(cfg.pixel_means)[np.newaxis,
+                                               np.newaxis, :].astype('float32')
     total_batch_size = total_batch_size if total_batch_size else batch_size
     if mode != 'infer':
         assert total_batch_size % batch_size == 0
     if mode == 'train':
-        settings.train_file_list = os.path.join(settings.data_dir,
-                                                settings.train_file_list)
-        settings.train_data_dir = os.path.join(settings.data_dir,
-                                               settings.train_data_dir)
+        cfg.train_file_list = os.path.join(cfg.data_dir, cfg.train_file_list)
+        cfg.train_data_dir = os.path.join(cfg.data_dir, cfg.train_data_dir)
     elif mode == 'test' or mode == 'infer':
-        settings.val_file_list = os.path.join(settings.data_dir,
-                                              settings.val_file_list)
-        settings.val_data_dir = os.path.join(settings.data_dir,
-                                             settings.val_data_dir)
-    json_dataset = JsonDataset(settings, train=(mode == 'train'))
+        cfg.val_file_list = os.path.join(cfg.data_dir, cfg.val_file_list)
+        cfg.val_data_dir = os.path.join(cfg.data_dir, cfg.val_data_dir)
+    json_dataset = JsonDataset(train=(mode == 'train'))
     roidbs = json_dataset.get_roidb()
-    print("{} on {} with {} roidbs".format(mode, settings.dataset, len(roidbs)))
+    print("{} on {} with {} roidbs".format(mode, cfg.dataset, len(roidbs)))
     def roidb_reader(roidb, mode):
-        im, im_scales = data_utils.get_image_blob(roidb, settings)
+        im, im_scales = data_utils.get_image_blob(roidb, mode)
         im_id = roidb['id']
         im_height = np.round(roidb['height'] * im_scales)
         im_width = np.round(roidb['width'] * im_scales)
@@ -150,7 +137,7 @@ def coco(settings,
         else:
             for roidb in roidbs:
-                if settings.image_name not in roidb['image']:
+                if cfg.image_name not in roidb['image']:
                     continue
                 im, im_info, im_id = roidb_reader(roidb, mode)
                 batch_out = [(im, im_info, im_id)]
@@ -159,23 +146,14 @@ def coco(settings,
     return reader
-def train(settings,
-          batch_size,
-          total_batch_size=None,
-          padding_total=False,
-          shuffle=True):
+def train(batch_size, total_batch_size=None, padding_total=False, shuffle=True):
     return coco(
-        settings,
-        'train',
-        batch_size,
-        total_batch_size,
-        padding_total,
-        shuffle=shuffle)
+        'train', batch_size, total_batch_size, padding_total, shuffle=shuffle)
-def test(settings, batch_size, total_batch_size=None, padding_total=False):
-    return coco(settings, 'test', batch_size, total_batch_size, shuffle=False)
+def test(batch_size, total_batch_size=None, padding_total=False):
+    return coco('test', batch_size, total_batch_size, shuffle=False)
-def infer(settings):
-    return coco(settings, 'infer')
+def infer():
+    return coco('infer')
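Review note: `coco()` now writes the dataset-specific file names onto the shared `cfg` object every time it is called. The same lookup can be expressed as a pure function, sketched below as one possible alternative. `coco_paths` is a hypothetical helper, not part of this change; the file names are the ones in the diff.

```python
def coco_paths(dataset):
    """Return annotation files and image directories for a COCO dataset name."""
    for year in ('2014', '2017'):
        if 'coco' + year in dataset:
            return {
                'train_file_list': 'annotations/instances_train%s.json' % year,
                'train_data_dir': 'train' + year,
                'val_file_list': 'annotations/instances_val%s.json' % year,
                'val_data_dir': 'val' + year,
            }
    raise NotImplementedError('Dataset {} not supported'.format(dataset))


paths = coco_paths('coco2017')
val_annotations = paths['val_file_list']
```

Returning a dictionary keeps the lookup free of side effects, so callers can join it with `data_dir` themselves without touching global state.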
@@ -26,7 +26,6 @@ from __future__ import print_function
 from __future__ import unicode_literals
 import copy
-import cPickle as pickle
 import logging
 import numpy as np
 import os
@@ -37,6 +36,7 @@ import matplotlib
 matplotlib.use('Agg')
 from pycocotools.coco import COCO
 import box_utils
+from config import cfg
 logger = logging.getLogger(__name__)
@@ -44,16 +44,16 @@ logger = logging.getLogger(__name__)
 class JsonDataset(object):
     """A class representing a COCO json dataset."""
-    def __init__(self, args, train=False):
-        print('Creating: {}'.format(args.dataset))
-        self.name = args.dataset
+    def __init__(self, train=False):
+        print('Creating: {}'.format(cfg.dataset))
+        self.name = cfg.dataset
         self.is_train = train
         if self.is_train:
-            data_dir = args.train_data_dir
-            file_list = args.train_file_list
+            data_dir = cfg.train_data_dir
+            file_list = cfg.train_file_list
         else:
-            data_dir = args.val_data_dir
-            file_list = args.val_file_list
+            data_dir = cfg.val_data_dir
+            file_list = cfg.val_file_list
         self.image_directory = data_dir
         self.COCO = COCO(file_list)
         # Set up dataset classes
@@ -91,7 +91,6 @@ class JsonDataset(object):
             end_time = time.time()
             print('_add_gt_annotations took {:.3f}s'.format(end_time -
                                                             start_time))
             print('Appending horizontally-flipped training examples...')
             self._extend_with_flipped_entries(roidb)
         print('Loaded dataset: {:s}'.format(self.name))
@@ -130,7 +129,7 @@ class JsonDataset(object):
         width = entry['width']
         height = entry['height']
         for obj in objs:
-            if obj['area'] < -1: #cfg.TRAIN.GT_MIN_AREA:
+            if obj['area'] < cfg.TRAIN.gt_min_area:
                 continue
             if 'ignore' in obj and obj['ignore'] == 1:
                 continue
......
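Review note: the last hunk of this file replaces the hard-coded `-1` area threshold with `cfg.TRAIN.gt_min_area`. The filter it controls can be sketched as below. `filter_annotations` is a hypothetical standalone version of the loop body, and the sample annotations are made up for illustration.

```python
def filter_annotations(objs, gt_min_area):
    """Drop annotations that are too small or explicitly marked as ignored."""
    kept = []
    for obj in objs:
        if obj['area'] < gt_min_area:
            continue
        if 'ignore' in obj and obj['ignore'] == 1:
            continue
        kept.append(obj)
    return kept


objs = [
    {'id': 1, 'area': 400.0},
    {'id': 2, 'area': 0.5},
    {'id': 3, 'area': 900.0, 'ignore': 1},
]
kept_default = filter_annotations(objs, gt_min_area=-1)
kept_strict = filter_annotations(objs, gt_min_area=1.0)
```

With the old threshold of `-1` no annotation is ever rejected for its area, so a config default of `-1` would preserve the previous behaviour.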
@@ -28,11 +28,12 @@ import reader
 import models.model_builder as model_builder
 import models.resnet as resnet
 from learning_rate import exponential_with_warmup_decay
+from config import cfg
-def train(cfg):
+def train():
     learning_rate = cfg.learning_rate
-    image_shape = [3, cfg.max_size, cfg.max_size]
+    image_shape = [3, cfg.TRAIN.max_size, cfg.TRAIN.max_size]
     if cfg.debug:
         fluid.default_startup_program().random_seed = 1000
@@ -43,9 +44,9 @@ def train(cfg):
     devices = os.getenv("CUDA_VISIBLE_DEVICES") or ""
     devices_num = len(devices.split(","))
+    total_batch_size = devices_num * cfg.TRAIN.im_per_batch
     model = model_builder.FasterRCNN(
-        cfg=cfg,
         add_conv_body_func=resnet.add_ResNet50_conv4_body,
         add_roi_box_head_func=resnet.add_ResNet_roi_conv5_head,
         use_pyreader=cfg.use_pyreader,
@@ -58,18 +59,20 @@ def train(cfg):
     rpn_reg_loss.persistable = True
     loss = loss_cls + loss_bbox + rpn_cls_loss + rpn_reg_loss
-    boundaries = [120000, 160000]
-    values = [learning_rate, learning_rate * 0.1, learning_rate * 0.01]
+    boundaries = cfg.lr_steps
+    gamma = cfg.lr_gamma
+    step_num = len(cfg.lr_steps)
+    values = [learning_rate * (gamma**i) for i in range(step_num + 1)]
     optimizer = fluid.optimizer.Momentum(
         learning_rate=exponential_with_warmup_decay(
             learning_rate=learning_rate,
             boundaries=boundaries,
             values=values,
-            warmup_iter=500,
-            warmup_factor=1.0 / 3.0),
-        regularization=fluid.regularizer.L2Decay(0.0001),
-        momentum=0.9)
+            warmup_iter=cfg.warm_up_iter,
+            warmup_factor=cfg.warm_up_factor),
+        regularization=fluid.regularizer.L2Decay(cfg.weight_decay),
+        momentum=cfg.momentum)
     optimizer.minimize(loss)
     fluid.memory_optimize(fluid.default_main_program())
...@@ -89,20 +92,16 @@ def train(cfg): ...@@ -89,20 +92,16 @@ def train(cfg):
train_exe = fluid.ParallelExecutor( train_exe = fluid.ParallelExecutor(
use_cuda=bool(cfg.use_gpu), loss_name=loss.name) use_cuda=bool(cfg.use_gpu), loss_name=loss.name)
assert cfg.batch_size % devices_num == 0
batch_size_per_dev = cfg.batch_size / devices_num
if cfg.use_pyreader: if cfg.use_pyreader:
train_reader = reader.train( train_reader = reader.train(
cfg, batch_size=cfg.TRAIN.im_per_batch,
batch_size=batch_size_per_dev, total_batch_size=total_batch_size,
total_batch_size=cfg.batch_size, padding_total=cfg.TRAIN.padding_minibatch,
padding_total=cfg.padding_minibatch,
shuffle=True) shuffle=True)
py_reader = model.py_reader py_reader = model.py_reader
py_reader.decorate_paddle_reader(train_reader) py_reader.decorate_paddle_reader(train_reader)
else: else:
train_reader = reader.train( train_reader = reader.train(batch_size=total_batch_size, shuffle=True)
cfg, batch_size=cfg.batch_size, shuffle=True)
feeder = fluid.DataFeeder(place=place, feed_list=model.feeds()) feeder = fluid.DataFeeder(place=place, feed_list=model.feeds())
def save_model(postfix): def save_model(postfix):
...@@ -133,7 +132,7 @@ def train(cfg): ...@@ -133,7 +132,7 @@ def train(cfg):
smoothed_loss.get_median_value( smoothed_loss.get_median_value(
), start_time - prev_start_time)) ), start_time - prev_start_time))
sys.stdout.flush() sys.stdout.flush()
if (iter_id + 1) % cfg.snapshot_stride == 0: if (iter_id + 1) % cfg.TRAIN.snapshot_iter == 0:
save_model("model_iter{}".format(iter_id)) save_model("model_iter{}".format(iter_id))
except fluid.core.EOFException: except fluid.core.EOFException:
py_reader.reset() py_reader.reset()
...@@ -159,7 +158,7 @@ def train(cfg): ...@@ -159,7 +158,7 @@ def train(cfg):
iter_id, lr[0], iter_id, lr[0],
smoothed_loss.get_median_value(), start_time - prev_start_time)) smoothed_loss.get_median_value(), start_time - prev_start_time))
sys.stdout.flush() sys.stdout.flush()
if (iter_id + 1) % cfg.snapshot_stride == 0: if (iter_id + 1) % cfg.TRAIN.snapshot_iter == 0:
save_model("model_iter{}".format(iter_id)) save_model("model_iter{}".format(iter_id))
if (iter_id + 1) == cfg.max_iter: if (iter_id + 1) == cfg.max_iter:
break break
...@@ -175,6 +174,4 @@ def train(cfg): ...@@ -175,6 +174,4 @@ def train(cfg):
if __name__ == '__main__': if __name__ == '__main__':
args = parse_args() args = parse_args()
print_arguments(args) print_arguments(args)
train()
data_args = reader.Settings(args)
train(data_args)
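The solver hunk above replaces the hard-coded `boundaries` and `values` with a list derived from `cfg.lr_steps` and `cfg.lr_gamma`. A minimal standalone sketch of that derivation follows. It assumes a plain piecewise-constant schedule and leaves out the warm-up that `exponential_with_warmup_decay` adds; the helper names `step_values` and `lr_at` are hypothetical and are not part of the repository.

```python
def step_values(base_lr, lr_steps, gamma):
    """Return one learning rate per interval delimited by lr_steps."""
    return [base_lr * (gamma ** i) for i in range(len(lr_steps) + 1)]


def lr_at(iteration, lr_steps, values):
    """Piecewise-constant lookup: index by the number of boundaries reached."""
    # Assumption: the rate switches at the boundary iteration itself.
    passed = sum(1 for boundary in lr_steps if iteration >= boundary)
    return values[passed]


# The previously hard-coded schedule, expressed through the new parameters.
values = step_values(0.01, [120000, 160000], 0.1)
print(values)
print(lr_at(130000, [120000, 160000], values))
```

With `lr_steps = [120000, 160000]` and `gamma = 0.1` this yields the same three rates as the removed `[learning_rate, learning_rate * 0.1, learning_rate * 0.01]` list, up to floating-point rounding.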
...@@ -18,7 +18,7 @@ Contains common utility functions. ...@@ -18,7 +18,7 @@ Contains common utility functions.
from __future__ import absolute_import from __future__ import absolute_import
from __future__ import division from __future__ import division
from __future__ import print_function from __future__ import print_function
import sys
import distutils.util import distutils.util
import numpy as np import numpy as np
import six import six
...@@ -26,6 +26,7 @@ from collections import deque ...@@ -26,6 +26,7 @@ from collections import deque
from paddle.fluid import core from paddle.fluid import core
import argparse import argparse
import functools import functools
from config import *
def print_arguments(args): def print_arguments(args):
...@@ -96,31 +97,33 @@ def parse_args(): ...@@ -96,31 +97,33 @@ def parse_args():
add_arg('model_save_dir', str, 'output', "The path to save model.") add_arg('model_save_dir', str, 'output', "The path to save model.")
add_arg('pretrained_model', str, 'imagenet_resnet50_fusebn', "The init model path.") add_arg('pretrained_model', str, 'imagenet_resnet50_fusebn', "The init model path.")
add_arg('dataset', str, 'coco2017', "coco2014, coco2017.") add_arg('dataset', str, 'coco2017', "coco2014, coco2017.")
add_arg('data_dir', str, 'data/COCO17', "The data root path.")
add_arg('class_num', int, 81, "Class number.") add_arg('class_num', int, 81, "Class number.")
add_arg('data_dir', str, 'data/COCO17', "The data root path.")
add_arg('use_pyreader', bool, True, "Use pyreader.") add_arg('use_pyreader', bool, True, "Use pyreader.")
add_arg('use_profile', bool, False, "Whether use profiler.")
add_arg('padding_minibatch',bool, False, add_arg('padding_minibatch',bool, False,
"If False, only resize image and not pad, image shape is different between" "If False, only resize image and not pad, image shape is different between"
" GPUs in one mini-batch. If True, image shape is the same in one mini-batch.") " GPUs in one mini-batch. If True, image shape is the same in one mini-batch.")
#SOLVER #SOLVER
add_arg('learning_rate', float, 0.01, "Learning rate.") add_arg('learning_rate', float, 0.01, "Learning rate.")
add_arg('max_iter', int, 180000, "Iter number.") add_arg('max_iter', int, 180000, "Iter number.")
add_arg('log_window', int, 1, "Log smooth window, set 1 for debug, set 20 for train.") add_arg('log_window', int, 20, "Log smooth window, set 1 for debug, set 20 for train.")
add_arg('snapshot_stride', int, 10000, "save model every snapshot stride.") # FAST RCNN
# RPN # RPN
add_arg('anchor_sizes', int, [32,64,128,256,512], "The size of anchors.") add_arg('anchor_sizes', int, [32,64,128,256,512], "The size of anchors.")
add_arg('aspect_ratios', float, [0.5,1.0,2.0], "The ratio of anchors.") add_arg('aspect_ratios', float, [0.5,1.0,2.0], "The ratio of anchors.")
add_arg('variance', float, [1.,1.,1.,1.], "The variance of anchors.") add_arg('variance', float, [1.,1.,1.,1.], "The variance of anchors.")
add_arg('rpn_stride', float, 16., "Stride of the feature map that RPN is attached.") add_arg('rpn_stride', float, [16.,16.], "Stride of the feature map that RPN is attached.")
# FAST RCNN add_arg('rpn_nms_thresh', float, 0.7, "NMS threshold used on RPN proposals")
# TRAIN TEST INFER # TRAIN TEST INFER
add_arg('batch_size', int, 1, "Minibatch size.") add_arg('im_per_batch', int, 1, "Minibatch size.")
add_arg('max_size', int, 1333, "The resized image height.") add_arg('max_size', int, 1333, "The resized image height.")
add_arg('scales', int, [800], "The resized image height.") add_arg('scales', int, [800], "The resized image height.")
add_arg('batch_size_per_im',int, 512, "fast rcnn head batch size") add_arg('batch_size_per_im',int, 512, "fast rcnn head batch size")
add_arg('mean_value', float, [102.9801, 115.9465, 122.7717], "pixel mean") add_arg('pixel_means', float, [102.9801, 115.9465, 122.7717], "pixel mean")
add_arg('nms_threshold', float, 0.5, "NMS threshold.") add_arg('nms_thresh', float, 0.5, "NMS threshold.")
add_arg('score_threshold', float, 0.05, "score threshold for NMS.") add_arg('score_thresh', float, 0.05, "score threshold for NMS.")
add_arg('snapshot_stride', int, 10000, "save model every snapshot stride.")
add_arg('debug', bool, False, "Debug mode") add_arg('debug', bool, False, "Debug mode")
# SINGLE EVAL AND DRAW # SINGLE EVAL AND DRAW
add_arg('draw_threshold', float, 0.8, "Confidence threshold to draw bbox.") add_arg('draw_threshold', float, 0.8, "Confidence threshold to draw bbox.")
...@@ -128,4 +131,9 @@ def parse_args(): ...@@ -128,4 +131,9 @@ def parse_args():
add_arg('image_name', str, '', "The single image used to inference and visualize.") add_arg('image_name', str, '', "The single image used to inference and visualize.")
# yapf: enable # yapf: enable
args = parser.parse_args() args = parser.parse_args()
file_name = sys.argv[0]
if 'train' in file_name or 'profile' in file_name:
merge_cfg_from_args(args, 'train')
else:
merge_cfg_from_args(args, 'test')
return args return args
...@@ -12,8 +12,12 @@ ...@@ -12,8 +12,12 @@
# See the License for the specific language governing permissions and # See the License for the specific language governing permissions and
# limitations under the License. # limitations under the License.
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import sys import sys
import os import os
import six
import argparse import argparse
import functools import functools
import matplotlib import matplotlib
...@@ -102,7 +106,7 @@ def train(args): ...@@ -102,7 +106,7 @@ def train(args):
noise_data = np.random.uniform( noise_data = np.random.uniform(
low=-1.0, high=1.0, low=-1.0, high=1.0,
size=[args.batch_size, NOISE_SIZE]).astype('float32') size=[args.batch_size, NOISE_SIZE]).astype('float32')
real_image = np.array(map(lambda x: x[0], data)).reshape( real_image = np.array(list(map(lambda x: x[0], data))).reshape(
-1, 784).astype('float32') -1, 784).astype('float32')
conditions_data = np.array([x[1] for x in data]).reshape( conditions_data = np.array([x[1] for x in data]).reshape(
[-1, 1]).astype("float32") [-1, 1]).astype("float32")
...@@ -138,7 +142,7 @@ def train(args): ...@@ -138,7 +142,7 @@ def train(args):
d_loss_np = [d_loss_1[0][0], d_loss_2[0][0]] d_loss_np = [d_loss_1[0][0], d_loss_2[0][0]]
for _ in xrange(NUM_TRAIN_TIMES_OF_DG): for _ in six.moves.xrange(NUM_TRAIN_TIMES_OF_DG):
noise_data = np.random.uniform( noise_data = np.random.uniform(
low=-1.0, high=1.0, low=-1.0, high=1.0,
size=[args.batch_size, NOISE_SIZE]).astype('float32') size=[args.batch_size, NOISE_SIZE]).astype('float32')
...@@ -159,8 +163,9 @@ def train(args): ...@@ -159,8 +163,9 @@ def train(args):
total_images = np.concatenate([real_image, generated_images]) total_images = np.concatenate([real_image, generated_images])
fig = plot(total_images) fig = plot(total_images)
msg = "Epoch ID={0}\n Batch ID={1}\n D-Loss={2}\n DG-Loss={3}\n gen={4}".format( msg = "Epoch ID={0}\n Batch ID={1}\n D-Loss={2}\n DG-Loss={3}\n gen={4}".format(
pass_id, batch_id, d_loss_np, dg_loss_np, pass_id, batch_id,
check(generated_images)) np.sum(d_loss_np),
np.sum(dg_loss_np), check(generated_images))
print(msg) print(msg)
plt.title(msg) plt.title(msg)
plt.savefig( plt.savefig(
......
...@@ -12,11 +12,15 @@ ...@@ -12,11 +12,15 @@
# See the License for the specific language governing permissions and # See the License for the specific language governing permissions and
# limitations under the License. # limitations under the License.
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import sys import sys
import os import os
import argparse import argparse
import functools import functools
import matplotlib import matplotlib
import six
import numpy as np import numpy as np
import paddle import paddle
import paddle.fluid as fluid import paddle.fluid as fluid
...@@ -98,7 +102,7 @@ def train(args): ...@@ -98,7 +102,7 @@ def train(args):
noise_data = np.random.uniform( noise_data = np.random.uniform(
low=-1.0, high=1.0, low=-1.0, high=1.0,
size=[args.batch_size, NOISE_SIZE]).astype('float32') size=[args.batch_size, NOISE_SIZE]).astype('float32')
real_image = np.array(map(lambda x: x[0], data)).reshape( real_image = np.array(list(map(lambda x: x[0], data))).reshape(
-1, 784).astype('float32') -1, 784).astype('float32')
real_labels = np.ones( real_labels = np.ones(
shape=[real_image.shape[0], 1], dtype='float32') shape=[real_image.shape[0], 1], dtype='float32')
...@@ -128,7 +132,7 @@ def train(args): ...@@ -128,7 +132,7 @@ def train(args):
d_loss_np = [d_loss_1[0][0], d_loss_2[0][0]] d_loss_np = [d_loss_1[0][0], d_loss_2[0][0]]
for _ in xrange(NUM_TRAIN_TIMES_OF_DG): for _ in six.moves.xrange(NUM_TRAIN_TIMES_OF_DG):
noise_data = np.random.uniform( noise_data = np.random.uniform(
low=-1.0, high=1.0, low=-1.0, high=1.0,
size=[args.batch_size, NOISE_SIZE]).astype('float32') size=[args.batch_size, NOISE_SIZE]).astype('float32')
...@@ -146,7 +150,8 @@ def train(args): ...@@ -146,7 +150,8 @@ def train(args):
fig = plot(total_images) fig = plot(total_images)
msg = "Epoch ID={0} Batch ID={1} D-Loss={2} DG-Loss={3}\n gen={4}".format( msg = "Epoch ID={0} Batch ID={1} D-Loss={2} DG-Loss={3}\n gen={4}".format(
pass_id, batch_id, pass_id, batch_id,
np.sum(d_loss_np), dg_loss_np, check(generated_images)) np.sum(d_loss_np),
np.sum(dg_loss_np), check(generated_images))
print(msg) print(msg)
plt.title(msg) plt.title(msg)
plt.savefig( plt.savefig(
......
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import paddle import paddle
import paddle.fluid as fluid import paddle.fluid as fluid
from utility import get_parent_function_name from utility import get_parent_function_name
...@@ -98,19 +101,19 @@ def D_cond(image, y): ...@@ -98,19 +101,19 @@ def D_cond(image, y):
h2 = bn(fc(h1, dfc_dim), act='leaky_relu') h2 = bn(fc(h1, dfc_dim), act='leaky_relu')
h2 = fluid.layers.concat([h2, y], 1) h2 = fluid.layers.concat([h2, y], 1)
h3 = fc(h2, 1) h3 = fc(h2, 1, act='sigmoid')
return h3 return h3
def G_cond(z, y): def G_cond(z, y):
s_h, s_w = output_height, output_width s_h, s_w = output_height, output_width
s_h2, s_h4 = int(s_h / 2), int(s_h / 4) s_h2, s_h4 = int(s_h // 2), int(s_h // 4)
s_w2, s_w4 = int(s_w / 2), int(s_w / 4) s_w2, s_w4 = int(s_w // 2), int(s_w // 4)
yb = fluid.layers.reshape(y, [-1, y_dim, 1, 1]) #NCHW yb = fluid.layers.reshape(y, [-1, y_dim, 1, 1]) #NCHW
z = fluid.layers.concat([z, y], 1) z = fluid.layers.concat([z, y], 1)
h0 = bn(fc(z, gfc_dim / 2), act='relu') h0 = bn(fc(z, gfc_dim // 2), act='relu')
h0 = fluid.layers.concat([h0, y], 1) h0 = fluid.layers.concat([h0, y], 1)
h1 = bn(fc(h0, gf_dim * 2 * s_h4 * s_w4), act='relu') h1 = bn(fc(h0, gf_dim * 2 * s_h4 * s_w4), act='relu')
...@@ -128,14 +131,14 @@ def D(x): ...@@ -128,14 +131,14 @@ def D(x):
x = conv(x, df_dim, act='leaky_relu') x = conv(x, df_dim, act='leaky_relu')
x = bn(conv(x, df_dim * 2), act='leaky_relu') x = bn(conv(x, df_dim * 2), act='leaky_relu')
x = bn(fc(x, dfc_dim), act='leaky_relu') x = bn(fc(x, dfc_dim), act='leaky_relu')
x = fc(x, 1, act=None) x = fc(x, 1, act='sigmoid')
return x return x
def G(x): def G(x):
x = bn(fc(x, gfc_dim)) x = bn(fc(x, gfc_dim))
x = bn(fc(x, gf_dim * 2 * img_dim / 4 * img_dim / 4)) x = bn(fc(x, gf_dim * 2 * img_dim // 4 * img_dim // 4))
x = fluid.layers.reshape(x, [-1, gf_dim * 2, img_dim / 4, img_dim / 4]) x = fluid.layers.reshape(x, [-1, gf_dim * 2, img_dim // 4, img_dim // 4])
x = deconv(x, gf_dim * 2, act='relu', output_size=[14, 14]) x = deconv(x, gf_dim * 2, act='relu', output_size=[14, 14])
x = deconv(x, 1, filter_size=5, padding=2, act='tanh', output_size=[28, 28]) x = deconv(x, 1, filter_size=5, padding=2, act='tanh', output_size=[28, 28])
x = fluid.layers.reshape(x, shape=[-1, 28 * 28]) x = fluid.layers.reshape(x, shape=[-1, 28 * 28])
......
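The GAN hunks above wrap `map(...)` in `list(...)` and replace `/` with `//` wherever a size is computed. A small sketch of why both changes matter, assuming Python 3 semantics (which `from __future__ import division` also enables for `/` on Python 2); the sample data is made up for illustration.

```python
# Made-up (image, label) pairs standing in for one mini-batch.
data = [([0.1, 0.2], 3), ([0.3, 0.4], 7)]

# In Python 3, map() returns a lazy iterator, so it is wrapped in list()
# before being handed to code that expects a sequence.
images = list(map(lambda x: x[0], data))
labels = [x[1] for x in data]

img_dim = 28
# True division returns a float even when the result is exact;
# floor division keeps the integer that a tensor shape requires.
true_div = img_dim / 4
floor_div = img_dim // 4

print(images, labels)
print(type(true_div).__name__, type(floor_div).__name__)
```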
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import math import math
import distutils.util import distutils.util
import numpy as np import numpy as np
import inspect import inspect
import matplotlib import matplotlib
import six
matplotlib.use('agg') matplotlib.use('agg')
import matplotlib.pyplot as plt import matplotlib.pyplot as plt
import matplotlib.gridspec as gridspec import matplotlib.gridspec as gridspec
...@@ -54,7 +58,7 @@ def print_arguments(args): ...@@ -54,7 +58,7 @@ def print_arguments(args):
:type args: argparse.Namespace :type args: argparse.Namespace
""" """
print("----------- Configuration Arguments -----------") print("----------- Configuration Arguments -----------")
for arg, value in sorted(vars(args).iteritems()): for arg, value in sorted(six.iteritems(vars(args))):
print("%s: %s" % (arg, value)) print("%s: %s" % (arg, value))
print("------------------------------------------------") print("------------------------------------------------")
......
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import os import os
from PIL import Image from PIL import Image
import numpy as np import numpy as np
from itertools import izip
A_LIST_FILE = "./data/horse2zebra/trainA.txt" A_LIST_FILE = "./data/horse2zebra/trainA.txt"
B_LIST_FILE = "./data/horse2zebra/trainB.txt" B_LIST_FILE = "./data/horse2zebra/trainB.txt"
...@@ -70,11 +72,3 @@ def b_test_reader(): ...@@ -70,11 +72,3 @@ def b_test_reader():
Reader of images with B style for test. Reader of images with B style for test.
""" """
return reader_creater(B_TEST_LIST_FILE, cycle=False, return_name=True) return reader_creater(B_TEST_LIST_FILE, cycle=False, return_name=True)
if __name__ == "__main__":
for A, B in izip(a_test_reader()(), a_test_reader()()):
print A[0].shape
print A[1]
print B[0].shape
print B[1]
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import data_reader import data_reader
import os import os
import random import random
...@@ -9,7 +12,6 @@ import paddle.fluid as fluid ...@@ -9,7 +12,6 @@ import paddle.fluid as fluid
import numpy as np import numpy as np
from paddle.fluid import core from paddle.fluid import core
from trainer import * from trainer import *
from itertools import izip
from scipy.misc import imsave from scipy.misc import imsave
import paddle.fluid.profiler as profiler import paddle.fluid.profiler as profiler
from utility import add_arguments, print_arguments, ImagePool from utility import add_arguments, print_arguments, ImagePool
...@@ -66,7 +68,7 @@ def train(args): ...@@ -66,7 +68,7 @@ def train(args):
if not os.path.exists(out_path): if not os.path.exists(out_path):
os.makedirs(out_path) os.makedirs(out_path)
i = 0 i = 0
for data_A, data_B in izip(A_test_reader(), B_test_reader()): for data_A, data_B in zip(A_test_reader(), B_test_reader()):
A_name = data_A[1] A_name = data_A[1]
B_name = data_B[1] B_name = data_B[1]
tensor_A = core.LoDTensor() tensor_A = core.LoDTensor()
...@@ -114,7 +116,7 @@ def train(args): ...@@ -114,7 +116,7 @@ def train(args):
exe, out_path + "/d_a", main_program=d_A_trainer.program) exe, out_path + "/d_a", main_program=d_A_trainer.program)
fluid.io.save_persistables( fluid.io.save_persistables(
exe, out_path + "/d_b", main_program=d_B_trainer.program) exe, out_path + "/d_b", main_program=d_B_trainer.program)
print "saved checkpoint to [%s]" % out_path print("saved checkpoint to {}".format(out_path))
sys.stdout.flush() sys.stdout.flush()
def init_model(): def init_model():
...@@ -128,7 +130,7 @@ def train(args): ...@@ -128,7 +130,7 @@ def train(args):
exe, args.init_model + "/d_a", main_program=d_A_trainer.program) exe, args.init_model + "/d_a", main_program=d_A_trainer.program)
fluid.io.load_persistables( fluid.io.load_persistables(
exe, args.init_model + "/d_b", main_program=d_B_trainer.program) exe, args.init_model + "/d_b", main_program=d_B_trainer.program)
print "Load model from [%s]" % args.init_model print("Load model from {}".format(args.init_model))
if args.init_model: if args.init_model:
init_model() init_model()
...@@ -136,8 +138,8 @@ def train(args): ...@@ -136,8 +138,8 @@ def train(args):
for epoch in range(args.epoch): for epoch in range(args.epoch):
batch_id = 0 batch_id = 0
for i in range(max_images_num): for i in range(max_images_num):
data_A = A_reader.next() data_A = next(A_reader)
data_B = B_reader.next() data_B = next(B_reader)
tensor_A = core.LoDTensor() tensor_A = core.LoDTensor()
tensor_B = core.LoDTensor() tensor_B = core.LoDTensor()
tensor_A.set(data_A, place) tensor_A.set(data_A, place)
...@@ -174,9 +176,9 @@ def train(args): ...@@ -174,9 +176,9 @@ def train(args):
feed={"input_A": tensor_A, feed={"input_A": tensor_A,
"fake_pool_A": fake_pool_A}) "fake_pool_A": fake_pool_A})
print "epoch[%d]; batch[%d]; g_A_loss: %s; d_B_loss: %s; g_B_loss: %s; d_A_loss: %s;" % ( print("epoch{}; batch{}; g_A_loss: {}; d_B_loss: {}; g_B_loss: {}; d_A_loss: {};".format(
epoch, batch_id, g_A_loss[0], d_B_loss[0], g_B_loss[0], epoch, batch_id, g_A_loss[0], d_B_loss[0], g_B_loss[0],
d_A_loss[0]) d_A_loss[0]))
sys.stdout.flush() sys.stdout.flush()
batch_id += 1 batch_id += 1
......
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
from model import * from model import *
import paddle.fluid as fluid import paddle.fluid as fluid
......
...@@ -17,6 +17,7 @@ from __future__ import absolute_import ...@@ -17,6 +17,7 @@ from __future__ import absolute_import
from __future__ import division from __future__ import division
from __future__ import print_function from __future__ import print_function
import distutils.util import distutils.util
import six
import random import random
import glob import glob
import numpy as np import numpy as np
...@@ -39,7 +40,7 @@ def print_arguments(args): ...@@ -39,7 +40,7 @@ def print_arguments(args):
:type args: argparse.Namespace :type args: argparse.Namespace
""" """
print("----------- Configuration Arguments -----------") print("----------- Configuration Arguments -----------")
for arg, value in sorted(vars(args).iteritems()): for arg, value in sorted(six.iteritems(vars(args))):
print("%s: %s" % (arg, value)) print("%s: %s" % (arg, value))
print("------------------------------------------------") print("------------------------------------------------")
......
...@@ -8,7 +8,7 @@ import os ...@@ -8,7 +8,7 @@ import os
import cv2 import cv2
import paddle.fluid as fluid import paddle.fluid as fluid
import paddle.v2 as paddle import paddle
from icnet import icnet from icnet import icnet
from utils import add_arguments, print_arguments, get_feeder_data from utils import add_arguments, print_arguments, get_feeder_data
from paddle.fluid.layers.learning_rate_scheduler import _decay_step_counter from paddle.fluid.layers.learning_rate_scheduler import _decay_step_counter
...@@ -111,10 +111,10 @@ def infer(args): ...@@ -111,10 +111,10 @@ def infer(args):
for line in open(args.images_list): for line in open(args.images_list):
image_file = args.images_path + "/" + line.strip() image_file = args.images_path + "/" + line.strip()
filename = os.path.basename(image_file) filename = os.path.basename(image_file)
image = paddle.image.load_image( image = paddle.dataset.image.load_image(
image_file, is_color=True).astype("float32") image_file, is_color=True).astype("float32")
image -= IMG_MEAN image -= IMG_MEAN
img = paddle.image.to_chw(image)[np.newaxis, :] img = paddle.dataset.image.to_chw(image)[np.newaxis, :]
image_t = fluid.core.LoDTensor() image_t = fluid.core.LoDTensor()
image_t.set(img, place) image_t.set(img, place)
result = exe.run(inference_program, result = exe.run(inference_program,
......
...@@ -17,6 +17,7 @@ from paddle.fluid.initializer import init_on_cpu ...@@ -17,6 +17,7 @@ from paddle.fluid.initializer import init_on_cpu
if 'ce_mode' in os.environ: if 'ce_mode' in os.environ:
np.random.seed(10) np.random.seed(10)
fluid.default_startup_program().random_seed = 90
parser = argparse.ArgumentParser(description=__doc__) parser = argparse.ArgumentParser(description=__doc__)
add_arg = functools.partial(add_arguments, argparser=parser) add_arg = functools.partial(add_arguments, argparser=parser)
...@@ -91,9 +92,6 @@ def train(args): ...@@ -91,9 +92,6 @@ def train(args):
place = fluid.CUDAPlace(0) place = fluid.CUDAPlace(0)
exe = fluid.Executor(place) exe = fluid.Executor(place)
if 'ce_mode' in os.environ:
fluid.default_startup_program().random_seed = 90
exe.run(fluid.default_startup_program()) exe.run(fluid.default_startup_program())
if args.init_model is not None: if args.init_model is not None:
...@@ -126,8 +124,9 @@ def train(args): ...@@ -126,8 +124,9 @@ def train(args):
sub124_loss += results[3] sub124_loss += results[3]
# training log # training log
if iter_id % LOG_PERIOD == 0: if iter_id % LOG_PERIOD == 0:
print("Iter[%d]; train loss: %.3f; sub4_loss: %.3f; sub24_loss: %.3f; sub124_loss: %.3f" % ( print(
iter_id, t_loss / LOG_PERIOD, sub4_loss / LOG_PERIOD, "Iter[%d]; train loss: %.3f; sub4_loss: %.3f; sub24_loss: %.3f; sub124_loss: %.3f"
% (iter_id, t_loss / LOG_PERIOD, sub4_loss / LOG_PERIOD,
sub24_loss / LOG_PERIOD, sub124_loss / LOG_PERIOD)) sub24_loss / LOG_PERIOD, sub124_loss / LOG_PERIOD))
print("kpis train_cost %f" % (t_loss / LOG_PERIOD)) print("kpis train_cost %f" % (t_loss / LOG_PERIOD))
......
...@@ -14,7 +14,7 @@ ...@@ -14,7 +14,7 @@
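## Installation ## Installation
Running the sample code in this directory requires PaddlePaddle Fluid v0.13.0 or later. If the PaddlePaddle version in your environment is older than this, please update PaddlePaddle following the instructions in the [installation document](http://www.paddlepaddle.org/docs/develop/documentation/zh/build_and_install/pip_install_cn.html). Running the sample code in this directory requires PaddlePaddle Fluid v0.13.0 or later. If the PaddlePaddle version in your environment is older than this, please update PaddlePaddle following the instructions in the installation document.
## Data preparation ## Data preparation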
......
...@@ -19,7 +19,7 @@ test_acc_top1_kpi = AccKpi( ...@@ -19,7 +19,7 @@ test_acc_top1_kpi = AccKpi(
test_acc_top5_kpi = AccKpi( test_acc_top5_kpi = AccKpi(
'test_acc_top5', 0.02, 0, actived=True, desc='TOP5 ACC') 'test_acc_top5', 0.02, 0, actived=True, desc='TOP5 ACC')
test_cost_kpi = CostKpi('test_cost', 0.02, 0, actived=True, desc='train cost') test_cost_kpi = CostKpi('test_cost', 0.02, 0, actived=True, desc='train cost')
train_speed_kpi = AccKpi( train_speed_kpi = DurationKpi(
'train_speed', 'train_speed',
0.05, 0.05,
0, 0,
...@@ -38,7 +38,7 @@ test_acc_top5_card4_kpi = AccKpi( ...@@ -38,7 +38,7 @@ test_acc_top5_card4_kpi = AccKpi(
'test_acc_top5_card4', 0.02, 0, actived=True, desc='TOP5 ACC') 'test_acc_top5_card4', 0.02, 0, actived=True, desc='TOP5 ACC')
test_cost_card4_kpi = CostKpi( test_cost_card4_kpi = CostKpi(
'test_cost_card4', 0.02, 0, actived=True, desc='train cost') 'test_cost_card4', 0.02, 0, actived=True, desc='train cost')
train_speed_card4_kpi = AccKpi( train_speed_card4_kpi = DurationKpi(
'train_speed_card4', 'train_speed_card4',
0.05, 0.05,
0, 0,
......
...@@ -19,7 +19,7 @@ This tool is used to convert a Caffe model to a Fluid model ...@@ -19,7 +19,7 @@ This tool is used to convert a Caffe model to a Fluid model
- Download one from github directly - Download one from github directly
``` ```
cd proto/ && wget https://github.com/ethereon/caffe-tensorflow/blob/master/kaffe/caffe/caffepb.py cd proto/ && wget https://raw.githubusercontent.com/ethereon/caffe-tensorflow/master/kaffe/caffe/caffepb.py
``` ```
2. Convert the Caffe model to Fluid model 2. Convert the Caffe model to Fluid model
......
...@@ -8,7 +8,7 @@ import sys ...@@ -8,7 +8,7 @@ import sys
import os import os
import numpy as np import numpy as np
import paddle.fluid as fluid import paddle.fluid as fluid
import paddle.v2 as paddle import paddle
def test_model(exe, test_program, fetch_list, test_reader, feeder): def test_model(exe, test_program, fetch_list, test_reader, feeder):
......
...@@ -21,6 +21,11 @@ def parse_args(): ...@@ -21,6 +21,11 @@ def parse_args():
action='store_true', action='store_true',
help='If set, run \ help='If set, run \
the task with continuous evaluation logs.') the task with continuous evaluation logs.')
parser.add_argument(
'--num_devices',
type=int,
default=1,
help='Number of GPU devices')
args = parser.parse_args() args = parser.parse_args()
return args return args
...@@ -165,13 +170,13 @@ def train(train_reader, ...@@ -165,13 +170,13 @@ def train(train_reader,
print("finish training") print("finish training")
def get_cards(enable_ce): def get_cards(args):
if enable_ce: if args.enable_ce:
cards = os.environ.get('CUDA_VISIBLE_DEVICES') cards = os.environ.get('CUDA_VISIBLE_DEVICES')
num = len(cards.split(",")) num = len(cards.split(","))
return num return num
else: else:
return fluid.core.get_cuda_device_count() return args.num_devices
def train_net(): def train_net():
...@@ -179,7 +184,7 @@ def train_net(): ...@@ -179,7 +184,7 @@ def train_net():
batch_size = 20 batch_size = 20
args = parse_args() args = parse_args()
vocab, train_reader, test_reader = utils.prepare_data( vocab, train_reader, test_reader = utils.prepare_data(
batch_size=batch_size * get_cards(args.enable_ce), buffer_size=1000, \ batch_size=batch_size * get_cards(args), buffer_size=1000, \
word_freq_threshold=0, enable_ce = args.enable_ce) word_freq_threshold=0, enable_ce = args.enable_ce)
train( train(
train_reader=train_reader, train_reader=train_reader,
......
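The `get_cards` change above derives the device count either from `CUDA_VISIBLE_DEVICES` or from the new `--num_devices` flag. Note that `cards.split(",")` fails when the variable is unset, because `os.environ.get` then returns `None`. A defensive sketch of the counting step follows; `count_visible_devices` is a hypothetical helper, not a function from the repository.

```python
import os


def count_visible_devices(default=1):
    """Count entries in CUDA_VISIBLE_DEVICES, or return `default` if unset."""
    raw = os.environ.get("CUDA_VISIBLE_DEVICES")
    if raw is None:
        # Variable not set: fall back to the caller-supplied count.
        return default
    # Ignore empty fragments, so an empty string counts as zero devices.
    ids = [part for part in raw.split(",") if part.strip()]
    return len(ids)


os.environ["CUDA_VISIBLE_DEVICES"] = "0,1,2,3"
print(count_visible_devices())
```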
# Abstract
DuReader is an end-to-end neural network model for machine reading comprehension style question answering, which aims to answer questions from given passages. We first match the question and passages with a bidirectional attention flow network to obtain the question-aware passage representation. Then we employ a pointer network to locate the positions of answers in the passages. Our experimental evaluations show that the DuReader model achieves state-of-the-art results on the DuReader Dataset.
# Dataset
DuReader Dataset is a new large-scale, real-world, human-sourced MRC dataset in Chinese. DuReader focuses on real-world open-domain question answering. The advantages of DuReader over existing datasets are summarized as follows:
- Real question
- Real article
- Real answer
- Real application scenario
- Rich annotation
# Network
DuReader model is inspired by 3 classic reading comprehension models([BiDAF](https://arxiv.org/abs/1611.01603), [Match-LSTM](https://arxiv.org/abs/1608.07905), [R-NET](https://www.microsoft.com/en-us/research/wp-content/uploads/2017/05/r-net.pdf)).
The DuReader model is a hierarchical multi-stage process and consists of five layers:
- **Word Embedding Layer** maps each word to a vector using a pre-trained word embedding model.
- **Encoding Layer** extracts context information for each position in question and passages with a bi-directional LSTM network.
- **Attention Flow Layer** couples the query and context vectors and produces a set of query-aware feature vectors for each word in the context. Please refer to [BiDAF](https://arxiv.org/abs/1611.01603) for more details.
- **Fusion Layer** employs a layer of bi-directional LSTM to capture the interaction among context words independent of the query.
- **Decode Layer** employs an answer pointer network with attention pooling of the question to locate the positions of answers from passages. Please refer to [Match-LSTM](https://arxiv.org/abs/1608.07905) and [R-NET](https://www.microsoft.com/en-us/research/wp-content/uploads/2017/05/r-net.pdf) for more details.
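
To make the Attention Flow Layer more concrete, here is a rough sketch of its context-to-query direction only. It is an illustration under stated assumptions rather than the DuReader implementation: a plain dot product stands in for BiDAF's trainable similarity function, vectors are Python lists, and the names `softmax`, `dot`, and `context_to_query` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def context_to_query(context, query):
    """For each context vector, return an attention-weighted sum of query vectors."""
    dim = len(query[0])
    attended = []
    for c in context:
        # Similarity of this context word to every query word.
        weights = softmax([dot(c, q) for q in query])
        attended.append([
            sum(w * q[d] for w, q in zip(weights, query)) for d in range(dim)
        ])
    return attended


# Toy 2-dimensional "encodings" for three context words and two query words.
context = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [[1.0, 0.0], [0.0, 1.0]]
for row in context_to_query(context, query):
    print(row)
```

Each output row is a query summary aligned to one context word; a context word that is equally similar to both query words receives their average.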
## How to Run
### Download the Dataset
To download the DuReader dataset:
```
cd data && bash download.sh
```
For more details about the DuReader dataset, please refer to the [DuReader Dataset Homepage](https://ai.baidu.com//broad/subordinate?dataset=dureader).
### Download Thirdparty Dependencies
We use Bleu and Rouge as evaluation metrics. The calculation of these metrics relies on the scoring scripts under [coco-caption](https://github.com/tylin/coco-caption). To download them, run:
```
cd utils && bash download_thirdparty.sh
```
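
For intuition about what the Bleu scoring computes, the sketch below shows clipped (modified) n-gram precision, the core quantity behind Bleu. It is not the coco-caption code: it handles a single reference, omits the brevity penalty and the geometric mean over n-gram orders, and the function names are hypothetical.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def modified_precision(candidate, reference, n):
    """Fraction of candidate n-grams found in the reference, with clipped counts."""
    cand_counts = Counter(ngrams(candidate, n))
    ref_counts = Counter(ngrams(reference, n))
    total = sum(cand_counts.values())
    if total == 0:
        return 0.0
    # Each n-gram is credited at most as often as it occurs in the reference.
    clipped = sum(
        min(count, ref_counts[gram]) for gram, count in cand_counts.items())
    return clipped / total


candidate = "the the the the".split()
reference = "the cat is on the mat".split()
print(modified_precision(candidate, reference, 1))
print(modified_precision(candidate, reference, 2))
```

Clipping is what keeps a degenerate answer that repeats one common word from scoring a perfect unigram precision.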
### Environment Requirements
For now we've only tested on PaddlePaddle v1.0. To install PaddlePaddle, and for more details about it, see the [PaddlePaddle Homepage](http://paddlepaddle.org).
### Preparation
Before training the model, we have to make sure that the data is ready. For preparation, we will check the data files, make directories and extract a vocabulary for later use. You can run the following command to do this with a specified task name:
```
sh run.sh --prepare
```
You can specify the files for train/dev/test by setting the `trainset`/`devset`/`testset`.
### Training
To train the model, pass `--train`. You can also set hyper-parameters such as the learning rate with `--learning_rate NUM`. For example, to train the model for 10 passes, you can run:
```
sh run.sh --train --pass_num 10
```
The training process includes an evaluation on the dev set after each training epoch. By default, the model with the least Bleu-4 score on the dev set will be saved.
### Evaluation
To conduct a single evaluation on the dev set with an already trained model, you can run the following command:
```
sh run.sh --evaluate --load_dir models/1
```
### Prediction
You can also predict answers for the samples in some files using the following command:
```
sh run.sh --predict --load_dir models/1 --testset ../data/preprocessed/testset/search.dev.json
```
By default, the results are saved in the `../data/results/` folder. You can change this by specifying `--result_dir DIR_PATH`.
# Copyright (c) 2018 PaddlePaddle Authors. All Rights Reserve.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import argparse
import distutils.util
def parse_args():
parser = argparse.ArgumentParser(description=__doc__)
parser.add_argument(
'--prepare',
action='store_true',
help='create the directories, prepare the vocabulary and embeddings')
parser.add_argument('--train', action='store_true', help='train the model')
parser.add_argument(
'--evaluate', action='store_true', help='evaluate the model on dev set')
parser.add_argument(
'--predict',
action='store_true',
help='predict the answers for test set with trained model')
parser.add_argument(
"--embed_size",
type=int,
default=300,
help="The dimension of embedding table. (default: %(default)d)")
parser.add_argument(
"--hidden_size",
type=int,
default=300,
help="The size of rnn hidden unit. (default: %(default)d)")
parser.add_argument(
"--batch_size",
type=int,
default=32,
help="The number of sequences in a mini-batch. (default: %(default)d)")
parser.add_argument(
"--pass_num",
type=int,
default=5,
help="The pass number to train. (default: %(default)d)")
parser.add_argument(
"--learning_rate",
type=float,
default=0.001,
help="Learning rate used to train the model. (default: %(default)f)")
parser.add_argument(
"--weight_decay",
type=float,
default=0.0001,
help="Weight decay. (default: %(default)f)")
parser.add_argument(
"--use_gpu",
type=distutils.util.strtobool,
default=True,
help="Whether to use gpu. (default: %(default)d)")
parser.add_argument(
"--save_dir",
type=str,
default="model",
help="Specify the path to save trained models.")
parser.add_argument(
"--load_dir",
type=str,
default="",
help="Specify the path to load trained models.")
parser.add_argument(
"--save_interval",
type=int,
default=1,
help="Save the trained model every n passes. "
"(default: %(default)d)")
parser.add_argument(
"--log_interval",
type=int,
default=50,
help="Log the train loss every n batches. "
"(default: %(default)d)")
parser.add_argument(
"--dev_interval",
type=int,
default=1000,
help="Calculate the dev loss every n batches. "
"(default: %(default)d)")
parser.add_argument('--optim', default='adam', help='optimizer type')
parser.add_argument('--trainset', nargs='+', help='train dataset')
parser.add_argument('--devset', nargs='+', help='dev dataset')
parser.add_argument('--testset', nargs='+', help='test dataset')
parser.add_argument('--vocab_dir', help='directory of the vocabulary')
parser.add_argument('--max_p_num', type=int, default=5)
parser.add_argument('--max_a_len', type=int, default=200)
parser.add_argument('--max_p_len', type=int, default=500)
parser.add_argument('--max_q_len', type=int, default=9)
parser.add_argument('--doc_num', type=int, default=5)
parser.add_argument('--para_print', action='store_true')
parser.add_argument('--drop_rate', type=float, default=0.0)
parser.add_argument('--random_seed', type=int, default=123)
parser.add_argument(
'--log_path',
help='path of the log file. If not set, logs are printed to console')
parser.add_argument(
'--result_dir',
default='../data/results/',
help='the dir to output the results')
parser.add_argument(
'--result_name',
default='test_result',
help='the file name of the results')
args = parser.parse_args()
return args
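The boolean and multi-file options defined above can be exercised without the rest of the project. A minimal sketch: `str2bool` is a hypothetical stand-in for `distutils.util.strtobool` (the `distutils` package was removed from the standard library in Python 3.12), and the file names are placeholders.

```python
import argparse


def str2bool(value):
    """Parse common true/false spellings (a stand-in for strtobool)."""
    lowered = value.lower()
    if lowered in ("y", "yes", "t", "true", "on", "1"):
        return True
    if lowered in ("n", "no", "f", "false", "off", "0"):
        return False
    raise argparse.ArgumentTypeError("invalid boolean value: %r" % value)


parser = argparse.ArgumentParser()
parser.add_argument("--use_gpu", type=str2bool, default=True)
parser.add_argument("--trainset", nargs="+")  # one or more file paths
parser.add_argument("--pass_num", type=int, default=5)

args = parser.parse_args(
    ["--use_gpu", "false", "--trainset", "a.json", "b.json"])
print(args.use_gpu, args.trainset, args.pass_num)
```

With `nargs="+"`, all values following `--trainset` are collected into a list, which is why the run script can pass several dataset files to one option.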
#!/bin/bash
# ==============================================================================
# Copyright 2017 Baidu.com, Inc. All Rights Reserved
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
if [[ -d preprocessed ]] && [[ -d raw ]]; then
echo "data exist"
exit 0
else
wget -c --no-check-certificate http://dureader.gz.bcebos.com/dureader_preprocessed.zip
fi
if md5sum --status -c md5sum.txt; then
unzip dureader_preprocessed.zip
else
echo "download data error!" >&2
exit 1
fi
7a4c28026f7dc94e8135d17203c63664 dureader_preprocessed.zip
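The same integrity check that `md5sum -c` performs can be sketched in Python, for platforms where `md5sum` is unavailable. The helper names `md5_of_file` and `verify` are hypothetical, and the demonstration uses a throwaway file rather than the real archive.

```python
import hashlib
import os
import tempfile


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fin:
        for chunk in iter(lambda: fin.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected_md5):
    """Return True when the file's digest matches the expected one."""
    return md5_of_file(path) == expected_md5.lower()


# Demonstrate on a temporary file with known content.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"hello")
ok = verify(tmp.name, hashlib.md5(b"hello").hexdigest())
os.remove(tmp.name)
print(ok)
```

Reading in chunks keeps memory use flat even for a large download.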
# -*- coding:utf8 -*-
# ==============================================================================
# Copyright 2017 Baidu.com, Inc. All Rights Reserved
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""
This module implements data process strategies.
"""
import os
import json
import logging
import numpy as np
from collections import Counter
class BRCDataset(object):
"""
This class implements the APIs for loading and using the Baidu reading comprehension dataset
"""
def __init__(self,
max_p_num,
max_p_len,
max_q_len,
train_files=[],
dev_files=[],
test_files=[]):
self.logger = logging.getLogger("brc")
self.max_p_num = max_p_num
self.max_p_len = max_p_len
self.max_q_len = max_q_len
self.train_set, self.dev_set, self.test_set = [], [], []
if train_files:
for train_file in train_files:
self.train_set += self._load_dataset(train_file, train=True)
self.logger.info('Train set size: {} questions.'.format(
len(self.train_set)))
if dev_files:
for dev_file in dev_files:
self.dev_set += self._load_dataset(dev_file)
self.logger.info('Dev set size: {} questions.'.format(
len(self.dev_set)))
if test_files:
for test_file in test_files:
self.test_set += self._load_dataset(test_file)
self.logger.info('Test set size: {} questions.'.format(
len(self.test_set)))
def _load_dataset(self, data_path, train=False):
"""
Loads the dataset
Args:
data_path: the data file to load
"""
with open(data_path) as fin:
data_set = []
for lidx, line in enumerate(fin):
sample = json.loads(line.strip())
if train:
if len(sample['answer_spans']) == 0:
continue
if sample['answer_spans'][0][1] >= self.max_p_len:
continue
if 'answer_docs' in sample:
sample['answer_passages'] = sample['answer_docs']
sample['question_tokens'] = sample['segmented_question']
sample['passages'] = []
for d_idx, doc in enumerate(sample['documents']):
if train:
most_related_para = doc['most_related_para']
sample['passages'].append({
'passage_tokens':
doc['segmented_paragraphs'][most_related_para],
'is_selected': doc['is_selected']
})
else:
para_infos = []
for para_tokens in doc['segmented_paragraphs']:
question_tokens = sample['segmented_question']
common_with_question = Counter(
para_tokens) & Counter(question_tokens)
correct_preds = sum(common_with_question.values())
if correct_preds == 0:
recall_wrt_question = 0
else:
recall_wrt_question = float(
correct_preds) / len(question_tokens)
para_infos.append((para_tokens, recall_wrt_question,
len(para_tokens)))
para_infos.sort(key=lambda x: (-x[1], x[2]))
fake_passage_tokens = []
for para_info in para_infos[:1]:
fake_passage_tokens += para_info[0]
sample['passages'].append({
'passage_tokens': fake_passage_tokens
})
data_set.append(sample)
return data_set
def _one_mini_batch(self, data, indices, pad_id):
"""
Get one mini batch
Args:
data: all data
indices: the indices of the samples to be selected
pad_id: the token id used for padding
Returns:
one batch of data
"""
batch_data = {
'raw_data': [data[i] for i in indices],
'question_token_ids': [],
'question_length': [],
'passage_token_ids': [],
'passage_length': [],
'start_id': [],
'end_id': []
}
max_passage_num = max(
[len(sample['passages']) for sample in batch_data['raw_data']])
#max_passage_num = min(self.max_p_num, max_passage_num)
max_passage_num = self.max_p_num
for sidx, sample in enumerate(batch_data['raw_data']):
for pidx in range(max_passage_num):
if pidx < len(sample['passages']):
batch_data['question_token_ids'].append(sample[
'question_token_ids'])
batch_data['question_length'].append(
len(sample['question_token_ids']))
passage_token_ids = sample['passages'][pidx][
'passage_token_ids']
batch_data['passage_token_ids'].append(passage_token_ids)
batch_data['passage_length'].append(
min(len(passage_token_ids), self.max_p_len))
else:
batch_data['question_token_ids'].append([])
batch_data['question_length'].append(0)
batch_data['passage_token_ids'].append([])
batch_data['passage_length'].append(0)
batch_data, padded_p_len, padded_q_len = self._dynamic_padding(
batch_data, pad_id)
for sample in batch_data['raw_data']:
if 'answer_passages' in sample and len(sample['answer_passages']):
gold_passage_offset = padded_p_len * sample['answer_passages'][
0]
batch_data['start_id'].append(gold_passage_offset + sample[
'answer_spans'][0][0])
batch_data['end_id'].append(gold_passage_offset + sample[
'answer_spans'][0][1])
else:
# fake span for some samples, only valid for testing
batch_data['start_id'].append(0)
batch_data['end_id'].append(0)
return batch_data
def _dynamic_padding(self, batch_data, pad_id):
"""
Dynamically pads the batch_data with pad_id
"""
pad_p_len = min(self.max_p_len, max(batch_data['passage_length']))
pad_q_len = min(self.max_q_len, max(batch_data['question_length']))
batch_data['passage_token_ids'] = [
(ids + [pad_id] * (pad_p_len - len(ids)))[:pad_p_len]
for ids in batch_data['passage_token_ids']
]
batch_data['question_token_ids'] = [
(ids + [pad_id] * (pad_q_len - len(ids)))[:pad_q_len]
for ids in batch_data['question_token_ids']
]
return batch_data, pad_p_len, pad_q_len
def word_iter(self, set_name=None):
"""
Iterates over all the words in the dataset
Args:
set_name: if it is set, then the specific set will be used
Returns:
a generator
"""
if set_name is None:
data_set = self.train_set + self.dev_set + self.test_set
elif set_name == 'train':
data_set = self.train_set
elif set_name == 'dev':
data_set = self.dev_set
elif set_name == 'test':
data_set = self.test_set
else:
raise NotImplementedError('No data set named as {}'.format(
set_name))
if data_set is not None:
for sample in data_set:
for token in sample['question_tokens']:
yield token
for passage in sample['passages']:
for token in passage['passage_tokens']:
yield token
def convert_to_ids(self, vocab):
"""
Convert the question and passage in the original dataset to ids
Args:
vocab: the vocabulary on this dataset
"""
for data_set in [self.train_set, self.dev_set, self.test_set]:
if data_set is None:
continue
for sample in data_set:
sample['question_token_ids'] = vocab.convert_to_ids(sample[
'question_tokens'])
for passage in sample['passages']:
passage['passage_token_ids'] = vocab.convert_to_ids(passage[
'passage_tokens'])
def gen_mini_batches(self, set_name, batch_size, pad_id, shuffle=True):
"""
Generate data batches for a specific dataset (train/dev/test)
Args:
set_name: train/dev/test to indicate the set
batch_size: number of samples in one batch
pad_id: pad id
shuffle: if set to be true, the data is shuffled.
Returns:
a generator for all batches
"""
if set_name == 'train':
data = self.train_set
elif set_name == 'dev':
data = self.dev_set
elif set_name == 'test':
data = self.test_set
else:
raise NotImplementedError('No data set named as {}'.format(
set_name))
data_size = len(data)
indices = np.arange(data_size)
if shuffle:
np.random.shuffle(indices)
for batch_start in np.arange(0, data_size, batch_size):
batch_indices = indices[batch_start:batch_start + batch_size]
yield self._one_mini_batch(data, batch_indices, pad_id)
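The dynamic padding and the flattened answer offsets used by the dataset class above can be tried in isolation. A minimal sketch with hypothetical names (`pad_batch`, `flat_span`); it mirrors the logic of `_dynamic_padding` and the `start_id`/`end_id` computation but is not part of the repository.

```python
def pad_batch(sequences, pad_id, max_len):
    """Pad or truncate id lists to a common length.

    The target length is the longest sequence in the batch, capped at
    max_len, so short batches are not padded all the way to max_len.
    """
    target = min(max_len, max(len(seq) for seq in sequences))
    padded = [(seq + [pad_id] * (target - len(seq)))[:target]
              for seq in sequences]
    return padded, target


def flat_span(passage_index, span, padded_len):
    """Shift a (start, end) span by the offset of its passage."""
    offset = padded_len * passage_index
    return offset + span[0], offset + span[1]


batch, length = pad_batch([[5, 6, 7], [8], []], pad_id=0, max_len=2)
print(batch, length)
print(flat_span(2, (3, 7), 10))
```

Because every passage is padded to the same length, a span inside the n-th passage can be addressed in the concatenated sequence by adding n times the padded length.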
# Copyright (c) 2018 PaddlePaddle Authors. All Rights Reserve.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import paddle.fluid.layers as layers
import paddle.fluid as fluid
import numpy as np
def dropout(input, args):
if args.drop_rate:
return layers.dropout(
input,
dropout_prob=args.drop_rate,
seed=args.random_seed,
is_test=False)
else:
return input
def bi_lstm_encoder(input_seq, gate_size, para_name, args):
# A bi-directional LSTM encoder implementation.
# The linear transformations for the input gate, output gate, forget gate
# and cell activation vectors must be done outside of dynamic_lstm,
# so the projection size is 4 times gate_size.
input_forward_proj = layers.fc(
input=input_seq,
param_attr=fluid.ParamAttr(name=para_name + '_fw_gate_w'),
size=gate_size * 4,
act=None,
bias_attr=False)
input_reversed_proj = layers.fc(
input=input_seq,
param_attr=fluid.ParamAttr(name=para_name + '_bw_gate_w'),
size=gate_size * 4,
act=None,
bias_attr=False)
forward, _ = layers.dynamic_lstm(
input=input_forward_proj,
size=gate_size * 4,
use_peepholes=False,
param_attr=fluid.ParamAttr(name=para_name + '_fw_lstm_w'),
bias_attr=fluid.ParamAttr(name=para_name + '_fw_lstm_b'))
reversed, _ = layers.dynamic_lstm(
input=input_reversed_proj,
param_attr=fluid.ParamAttr(name=para_name + '_bw_lstm_w'),
bias_attr=fluid.ParamAttr(name=para_name + '_bw_lstm_b'),
size=gate_size * 4,
is_reverse=True,
use_peepholes=False)
encoder_out = layers.concat(input=[forward, reversed], axis=1)
return encoder_out
def encoder(input_name, para_name, shape, hidden_size, args):
input_ids = layers.data(
name=input_name, shape=[1], dtype='int64', lod_level=1)
input_embedding = layers.embedding(
input=input_ids,
size=shape,
dtype='float32',
is_sparse=True,
param_attr=fluid.ParamAttr(name='embedding_para'))
encoder_out = bi_lstm_encoder(
input_seq=input_embedding,
gate_size=hidden_size,
para_name=para_name,
args=args)
return dropout(encoder_out, args)
def attn_flow(q_enc, p_enc, p_ids_name, args):
tag = p_ids_name + "::"
drnn = layers.DynamicRNN()
with drnn.block():
h_cur = drnn.step_input(p_enc)
u_all = drnn.static_input(q_enc)
h_expd = layers.sequence_expand(x=h_cur, y=u_all)
s_t_mul = layers.elementwise_mul(x=u_all, y=h_expd, axis=0)
s_t_sum = layers.reduce_sum(input=s_t_mul, dim=1, keep_dim=True)
s_t_re = layers.reshape(s_t_sum, shape=[-1, 0])
s_t = layers.sequence_softmax(input=s_t_re)
u_expr = layers.elementwise_mul(x=u_all, y=s_t, axis=0)
u_expr = layers.sequence_pool(input=u_expr, pool_type='sum')
b_t = layers.sequence_pool(input=s_t_sum, pool_type='max')
drnn.output(u_expr, b_t)
U_expr, b = drnn()
b_norm = layers.sequence_softmax(input=b)
h_expr = layers.elementwise_mul(x=p_enc, y=b_norm, axis=0)
h_expr = layers.sequence_pool(input=h_expr, pool_type='sum')
H_expr = layers.sequence_expand(x=h_expr, y=p_enc)
H_expr = layers.lod_reset(x=H_expr, y=p_enc)
h_u = layers.elementwise_mul(x=p_enc, y=U_expr, axis=0)
h_h = layers.elementwise_mul(x=p_enc, y=H_expr, axis=0)
g = layers.concat(input=[p_enc, U_expr, h_u, h_h], axis=1)
return dropout(g, args)
def lstm_step(x_t, hidden_t_prev, cell_t_prev, size, para_name, args):
def linear(inputs, para_name, args):
return layers.fc(input=inputs,
size=size,
param_attr=fluid.ParamAttr(name=para_name + '_w'),
bias_attr=fluid.ParamAttr(name=para_name + '_b'))
input_cat = layers.concat([hidden_t_prev, x_t], axis=1)
forget_gate = layers.sigmoid(x=linear(input_cat, para_name + '_lstm_f',
args))
input_gate = layers.sigmoid(x=linear(input_cat, para_name + '_lstm_i',
args))
output_gate = layers.sigmoid(x=linear(input_cat, para_name + '_lstm_o',
args))
cell_tilde = layers.tanh(x=linear(input_cat, para_name + '_lstm_c', args))
cell_t = layers.sums(input=[
layers.elementwise_mul(
x=forget_gate, y=cell_t_prev), layers.elementwise_mul(
x=input_gate, y=cell_tilde)
])
hidden_t = layers.elementwise_mul(x=output_gate, y=layers.tanh(x=cell_t))
return hidden_t, cell_t
# pointer network decoder
def point_network_decoder(p_vec, q_vec, hidden_size, args):
tag = 'pn_decoder:'
init_random = fluid.initializer.Normal(loc=0.0, scale=1.0)
random_attn = layers.create_parameter(
shape=[1, hidden_size],
dtype='float32',
default_initializer=init_random)
random_attn = layers.fc(
input=random_attn,
size=hidden_size,
act=None,
param_attr=fluid.ParamAttr(name=tag + 'random_attn_fc_w'),
bias_attr=fluid.ParamAttr(name=tag + 'random_attn_fc_b'))
random_attn = layers.reshape(random_attn, shape=[-1])
U = layers.fc(input=q_vec,
param_attr=fluid.ParamAttr(name=tag + 'q_vec_fc_w'),
bias_attr=False,
size=hidden_size,
act=None) + random_attn
U = layers.tanh(U)
logits = layers.fc(input=U,
param_attr=fluid.ParamAttr(name=tag + 'logits_fc_w'),
bias_attr=fluid.ParamAttr(name=tag + 'logits_fc_b'),
size=1,
act=None)
scores = layers.sequence_softmax(input=logits)
pooled_vec = layers.elementwise_mul(x=q_vec, y=scores, axis=0)
pooled_vec = layers.sequence_pool(input=pooled_vec, pool_type='sum')
init_state = layers.fc(
input=pooled_vec,
param_attr=fluid.ParamAttr(name=tag + 'init_state_fc_w'),
bias_attr=fluid.ParamAttr(name=tag + 'init_state_fc_b'),
size=hidden_size,
act=None)
def custom_dynamic_rnn(p_vec, init_state, hidden_size, para_name, args):
tag = para_name + "custom_dynamic_rnn:"
def static_rnn(step,
p_vec=p_vec,
init_state=None,
para_name='',
args=args):
tag = para_name + "static_rnn:"
ctx = layers.fc(
input=p_vec,
param_attr=fluid.ParamAttr(name=tag + 'context_fc_w'),
bias_attr=fluid.ParamAttr(name=tag + 'context_fc_b'),
size=hidden_size,
act=None)
beta = []
c_prev = init_state
m_prev = init_state
for i in range(step):
m_prev0 = layers.fc(
input=m_prev,
size=hidden_size,
act=None,
param_attr=fluid.ParamAttr(name=tag + 'm_prev0_fc_w'),
bias_attr=fluid.ParamAttr(name=tag + 'm_prev0_fc_b'))
m_prev1 = layers.sequence_expand(x=m_prev0, y=ctx)
Fk = ctx + m_prev1
Fk = layers.tanh(Fk)
logits = layers.fc(
input=Fk,
size=1,
act=None,
param_attr=fluid.ParamAttr(name=tag + 'logits_fc_w'),
bias_attr=fluid.ParamAttr(name=tag + 'logits_fc_b'))
scores = layers.sequence_softmax(input=logits)
attn_ctx = layers.elementwise_mul(x=p_vec, y=scores, axis=0)
attn_ctx = layers.sequence_pool(input=attn_ctx, pool_type='sum')
hidden_t, cell_t = lstm_step(
attn_ctx,
hidden_t_prev=m_prev,
cell_t_prev=c_prev,
size=hidden_size,
para_name=tag,
args=args)
m_prev = hidden_t
c_prev = cell_t
beta.append(scores)
return beta
return static_rnn(
2, p_vec=p_vec, init_state=init_state, para_name=para_name)
fw_outputs = custom_dynamic_rnn(p_vec, init_state, hidden_size, tag + "fw:",
args)
bw_outputs = custom_dynamic_rnn(p_vec, init_state, hidden_size, tag + "bw:",
args)
start_prob = layers.elementwise_add(
x=fw_outputs[0], y=bw_outputs[1], axis=0) / 2
end_prob = layers.elementwise_add(
x=fw_outputs[1], y=bw_outputs[0], axis=0) / 2
return start_prob, end_prob
def fusion(g, args):
m = bi_lstm_encoder(
input_seq=g, gate_size=args.hidden_size, para_name='fusion', args=args)
return dropout(m, args)
def rc_model(hidden_size, vocab, args):
emb_shape = [vocab.size(), vocab.embed_dim]
# stage 1:encode
p_ids_names = []
q_ids_names = []
ms = []
gs = []
qs = []
for i in range(args.doc_num):
p_ids_name = "pids_%d" % i
p_ids_names.append(p_ids_name)
p_enc_i = encoder(p_ids_name, 'p_enc', emb_shape, hidden_size, args)
q_ids_name = "qids_%d" % i
q_ids_names.append(q_ids_name)
q_enc_i = encoder(q_ids_name, 'q_enc', emb_shape, hidden_size, args)
# stage 2:match
g_i = attn_flow(q_enc_i, p_enc_i, p_ids_name, args)
# stage 3:fusion
m_i = fusion(g_i, args)
ms.append(m_i)
gs.append(g_i)
qs.append(q_enc_i)
m = layers.sequence_concat(input=ms)
g = layers.sequence_concat(input=gs)
q_vec = layers.sequence_concat(input=qs)
# stage 4:decode
start_probs, end_probs = point_network_decoder(
p_vec=m, q_vec=q_vec, hidden_size=hidden_size, args=args)
start_labels = layers.data(
name="start_lables", shape=[1], dtype='float32', lod_level=1)
end_labels = layers.data(
name="end_lables", shape=[1], dtype='float32', lod_level=1)
cost0 = layers.sequence_pool(
layers.cross_entropy(
input=start_probs, label=start_labels, soft_label=True),
'sum')
cost1 = layers.sequence_pool(
layers.cross_entropy(
input=end_probs, label=end_labels, soft_label=True),
'sum')
cost0 = layers.mean(cost0)
cost1 = layers.mean(cost1)
cost = cost0 + cost1
cost.persistable = True
feeding_list = q_ids_names + ["start_lables", "end_lables"] + p_ids_names
return cost, start_probs, end_probs, feeding_list
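The gate arithmetic of `lstm_step` above can be followed in plain Python. This is a sketch under assumptions: weights are nested lists rather than PaddlePaddle parameters, and the `params` dictionary layout and the `linear` helper are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def linear(weights, bias, inputs):
    """Compute weights @ inputs + bias for list-based matrices."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, bias)]


def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM step; params maps 'f', 'i', 'o', 'c' to (weights, bias)."""
    cat = h_prev + x_t  # concatenate previous hidden state and input

    def gate(name, act):
        weights, bias = params[name]
        return [act(v) for v in linear(weights, bias, cat)]

    f = gate("f", sigmoid)         # forget gate
    i = gate("i", sigmoid)         # input gate
    o = gate("o", sigmoid)         # output gate
    c_tilde = gate("c", math.tanh)  # candidate cell state
    c_t = [fg * cp + ig * ct
           for fg, cp, ig, ct in zip(f, c_prev, i, c_tilde)]
    h_t = [og * math.tanh(c) for og, c in zip(o, c_t)]
    return h_t, c_t


# With all-zero weights every gate equals 0.5 and the candidate is 0,
# so the new cell state is half of the previous one.
size = 2
zeros = ([[0.0] * (2 * size)] * size, [0.0] * size)
params = {name: zeros for name in "fioc"}
h, c = lstm_step([1.0, -1.0], [0.0, 0.0], [2.0, 4.0], params)
print(h, c)
```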
export CUDA_VISIBLE_DEVICES=1
python run.py \
--trainset 'data/preprocessed/trainset/search.train.json' \
'data/preprocessed/trainset/zhidao.train.json' \
--devset 'data/preprocessed/devset/search.dev.json' \
'data/preprocessed/devset/zhidao.dev.json' \
--testset 'data/preprocessed/testset/search.test.json' \
'data/preprocessed/testset/zhidao.test.json' \
--vocab_dir 'data/vocab' \
--use_gpu true \
--save_dir ./models \
--pass_num 10 \
--learning_rate 0.001 \
--batch_size 8 \
--embed_size 300 \
--hidden_size 150 \
--max_p_num 5 \
--max_p_len 500 \
--max_q_len 60 \
--max_a_len 200 \
--drop_rate 0.2 "$@"
# coding:utf8
# ==============================================================================
# Copyright 2017 Baidu.com, Inc. All Rights Reserved
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""
This package implements some utility functions shared by PaddlePaddle
and Tensorflow model implementations.
Authors: liuyuan(liuyuan04@baidu.com)
Date: 2017/10/06 18:23:06
"""
from .dureader_eval import compute_bleu_rouge
from .dureader_eval import normalize
from .preprocess import find_fake_answer
from .preprocess import find_best_question_match
__all__ = [
'compute_bleu_rouge',
'normalize',
'find_fake_answer',
'find_best_question_match',
]
#!/bin/bash
# ==============================================================================
# Copyright 2017 Baidu.com, Inc. All Rights Reserved
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
# We use Bleu and Rouge as evaluation metrics, the calculation of these metrics
# relies on the scoring scripts under "https://github.com/tylin/coco-caption"
bleu_base_url='https://raw.githubusercontent.com/tylin/coco-caption/master/pycocoevalcap/bleu'
bleu_files=("LICENSE" "__init__.py" "bleu.py" "bleu_scorer.py")
rouge_base_url="https://raw.githubusercontent.com/tylin/coco-caption/master/pycocoevalcap/rouge"
rouge_files=("__init__.py" "rouge.py")
download() {
local metric=$1; shift;
local base_url=$1; shift;
local fnames=("$@");
mkdir -p "${metric}"
for fname in "${fnames[@]}";
do
printf "downloading: %s\n" "${base_url}/${fname}"
wget --no-check-certificate "${base_url}/${fname}" -O "${metric}/${fname}"
done
}
# prepare rouge
download "rouge_metric" ${rouge_base_url} ${rouge_files[@]}
# prepare bleu
download "bleu_metric" ${bleu_base_url} ${bleu_files[@]}
# convert python 2.x source code to python 3.x
2to3 -w "../utils/bleu_metric/bleu_scorer.py"
2to3 -w "../utils/bleu_metric/bleu.py"
# -*- coding:utf8 -*-
# ==============================================================================
# Copyright 2017 Baidu.com, Inc. All Rights Reserved
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""
This module computes evaluation metrics for DuReader dataset.
"""
import argparse
import json
import sys
import zipfile
from collections import Counter
from .bleu_metric.bleu import Bleu
from .rouge_metric.rouge import Rouge
EMPTY = ''
YESNO_LABELS = set(['Yes', 'No', 'Depends'])
def normalize(s):
"""
Normalize strings to space joined chars.
Args:
s: a list of strings.
Returns:
A list of normalized strings.
"""
if not s:
return s
normalized = []
for ss in s:
tokens = [c for c in list(ss) if len(c.strip()) != 0]
normalized.append(' '.join(tokens))
return normalized
def data_check(obj, task):
"""
Check data.
Raises:
Raises AssertionError when data is not legal.
"""
assert 'question_id' in obj, "Missing 'question_id' field."
assert 'question_type' in obj, \
"Missing 'question_type' field. question_id: {}".format(obj['question_id'])
assert 'yesno_answers' in obj, \
"Missing 'yesno_answers' field. question_id: {}".format(obj['question_id'])
assert isinstance(obj['yesno_answers'], list), \
r"""'yesno_answers' field must be a list, if the 'question_type' is not
'YES_NO', then this field should be an empty list.
question_id: {}""".format(obj['question_id'])
assert 'entity_answers' in obj, \
"Missing 'entity_answers' field. question_id: {}".format(obj['question_id'])
assert isinstance(obj['entity_answers'], list) \
and len(obj['entity_answers']) > 0, \
r"""'entity_answers' field must be a list with at least one element,
which can be an empty list. question_id: {}""".format(obj['question_id'])
def read_file(file_name, task, is_ref=False):
"""
Read predicted answers or reference answers from file.
Args:
file_name: the name of the file containing the predicted results or the
reference results.
Returns:
A dictionary mapping question_id to the result information. The result
information itself is also a dictionary that has four keys:
- question_type: type of the query.
- yesno_answers: A list of yesno answers corresponding to 'answers'.
- answers: A list of predicted answers.
- entity_answers: A list, each element is also a list containing the entities
tagged out from the corresponding answer string.
"""
def _open(file_name, mode, zip_obj=None):
if zip_obj is not None:
return zip_obj.open(file_name, mode)
return open(file_name, mode)
results = {}
keys = ['answers', 'yesno_answers', 'entity_answers', 'question_type']
if is_ref:
keys += ['source']
zf = zipfile.ZipFile(file_name, 'r') if file_name.endswith('.zip') else None
file_list = [file_name] if zf is None else zf.namelist()
for fn in file_list:
for line in _open(fn, 'r', zip_obj=zf):
try:
obj = json.loads(line.strip())
except ValueError:
raise ValueError("Every line of data should be legal json")
data_check(obj, task)
qid = obj['question_id']
assert qid not in results, "Duplicate question_id: {}".format(qid)
results[qid] = {}
for k in keys:
results[qid][k] = obj[k]
return results
def compute_bleu_rouge(pred_dict, ref_dict, bleu_order=4):
"""
Compute bleu and rouge scores.
"""
assert set(pred_dict.keys()) == set(ref_dict.keys()), \
"missing keys: {}".format(set(ref_dict.keys()) - set(pred_dict.keys()))
scores = {}
bleu_scores, _ = Bleu(bleu_order).compute_score(ref_dict, pred_dict)
for i, bleu_score in enumerate(bleu_scores):
scores['Bleu-%d' % (i + 1)] = bleu_score
rouge_score, _ = Rouge().compute_score(ref_dict, pred_dict)
scores['Rouge-L'] = rouge_score
return scores
def local_prf(pred_list, ref_list):
"""
Compute local precision recall and f1-score,
given only one prediction list and one reference list
"""
common = Counter(pred_list) & Counter(ref_list)
num_same = sum(common.values())
if num_same == 0:
return 0, 0, 0
p = 1.0 * num_same / len(pred_list)
r = 1.0 * num_same / len(ref_list)
f1 = (2 * p * r) / (p + r)
return p, r, f1
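A worked example helps to see how the multiset intersection in `local_prf` treats repeated tokens. The function below is a standalone sketch with a hypothetical name (`token_prf`) that follows the same steps.

```python
from collections import Counter


def token_prf(pred_tokens, ref_tokens):
    """Precision, recall and F1 from the multiset overlap of two lists."""
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0, 0.0, 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return precision, recall, 2 * precision * recall / (precision + recall)


# "a" appears twice in the prediction but once in the reference,
# so only one occurrence counts toward the overlap.
p, r, f1 = token_prf(["a", "a", "b"], ["a", "b", "c", "d"])
print(p, r, f1)
```

The overlap is 2 (one `a` and one `b`), giving a precision of 2/3, a recall of 2/4, and an F1 of 4/7.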
def compute_prf(pred_dict, ref_dict):
"""
Compute precision recall and f1-score.
"""
pred_question_ids = set(pred_dict.keys())
ref_question_ids = set(ref_dict.keys())
correct_preds, total_correct, total_preds = 0, 0, 0
for question_id in ref_question_ids:
pred_entity_list = pred_dict.get(question_id, [[]])
assert len(pred_entity_list) == 1, \
'the number of entity list for question_id {} is not 1.'.format(question_id)
pred_entity_list = pred_entity_list[0]
all_ref_entity_lists = ref_dict[question_id]
best_local_f1 = 0
best_ref_entity_list = None
for ref_entity_list in all_ref_entity_lists:
local_f1 = local_prf(pred_entity_list, ref_entity_list)[2]
if local_f1 > best_local_f1:
best_ref_entity_list = ref_entity_list
best_local_f1 = local_f1
if best_ref_entity_list is None:
if len(all_ref_entity_lists) > 0:
best_ref_entity_list = sorted(
all_ref_entity_lists, key=lambda x: len(x))[0]
else:
best_ref_entity_list = []
gold_entities = set(best_ref_entity_list)
pred_entities = set(pred_entity_list)
correct_preds += len(gold_entities & pred_entities)
total_preds += len(pred_entities)
total_correct += len(gold_entities)
p = float(correct_preds) / total_preds if correct_preds > 0 else 0
r = float(correct_preds) / total_correct if correct_preds > 0 else 0
f1 = 2 * p * r / (p + r) if correct_preds > 0 else 0
return {'Precision': p, 'Recall': r, 'F1': f1}
def prepare_prf(pred_dict, ref_dict):
"""
Prepares data for calculation of prf scores.
"""
preds = {k: v['entity_answers'] for k, v in pred_dict.items()}
refs = {k: v['entity_answers'] for k, v in ref_dict.items()}
return preds, refs
def filter_dict(result_dict, key_tag):
"""
Filter a subset of the result_dict whose keys end with 'key_tag'.
"""
filtered = {}
for k, v in result_dict.items():
if k.endswith(key_tag):
filtered[k] = v
return filtered
def get_metrics(pred_result, ref_result, task, source):
"""
Computes metrics.
"""
metrics = {}
ref_result_filtered = {}
pred_result_filtered = {}
if source == 'both':
ref_result_filtered = ref_result
pred_result_filtered = pred_result
else:
for question_id, info in ref_result.items():
if info['source'] == source:
ref_result_filtered[question_id] = info
if question_id in pred_result:
pred_result_filtered[question_id] = pred_result[question_id]
if task == 'main' or task == 'all' \
or task == 'description':
pred_dict, ref_dict = prepare_bleu(pred_result_filtered,
ref_result_filtered, task)
metrics = compute_bleu_rouge(pred_dict, ref_dict)
elif task == 'yesno':
pred_dict, ref_dict = prepare_bleu(pred_result_filtered,
ref_result_filtered, task)
keys = ['Yes', 'No', 'Depends']
preds = [filter_dict(pred_dict, k) for k in keys]
refs = [filter_dict(ref_dict, k) for k in keys]
metrics = compute_bleu_rouge(pred_dict, ref_dict)
for k, pred, ref in zip(keys, preds, refs):
m = compute_bleu_rouge(pred, ref)
k_metric = [(k + '|' + key, v) for key, v in m.items()]
metrics.update(k_metric)
elif task == 'entity':
pred_dict, ref_dict = prepare_prf(pred_result_filtered,
ref_result_filtered)
pred_dict_bleu, ref_dict_bleu = prepare_bleu(pred_result_filtered,
ref_result_filtered, task)
metrics = compute_prf(pred_dict, ref_dict)
metrics.update(compute_bleu_rouge(pred_dict_bleu, ref_dict_bleu))
else:
raise ValueError("Illegal task name: {}".format(task))
return metrics
def prepare_bleu(pred_result, ref_result, task):
"""
Prepares data for calculation of bleu and rouge scores.
"""
pred_list, ref_list = [], []
qids = ref_result.keys()
for qid in qids:
if task == 'main':
pred, ref = get_main_result(qid, pred_result, ref_result)
elif task == 'yesno':
pred, ref = get_yesno_result(qid, pred_result, ref_result)
elif task == 'all':
pred, ref = get_all_result(qid, pred_result, ref_result)
elif task == 'entity':
pred, ref = get_entity_result(qid, pred_result, ref_result)
elif task == 'description':
pred, ref = get_desc_result(qid, pred_result, ref_result)
else:
raise ValueError("Illegal task name: {}".format(task))
if pred and ref:
pred_list += pred
ref_list += ref
pred_dict = dict(pred_list)
ref_dict = dict(ref_list)
for qid, ans in list(ref_dict.items()):
ref_dict[qid] = normalize(ref_dict[qid])
pred_dict[qid] = normalize(pred_dict.get(qid, [EMPTY]))
if not ans or ans == [EMPTY]:
del ref_dict[qid]
del pred_dict[qid]
for k, v in pred_dict.items():
assert len(v) == 1, \
"There should be only one predict answer. question_id: {}".format(k)
return pred_dict, ref_dict
def get_main_result(qid, pred_result, ref_result):
"""
Prepare answers for task 'main'.
Args:
qid: question_id.
pred_result: A dict that includes all question_ids' result information
read from args.pred_file.
ref_result: A dict that includes all question_ids' result information
read from args.ref_file.
Returns:
Two lists: the first contains the predicted result and the second
contains the reference result for the same question_id. Each list holds
(question_id, answers) tuples, where 'answers' is a list of strings.
"""
ref_ans = ref_result[qid]['answers']
if not ref_ans:
ref_ans = [EMPTY]
pred_ans = pred_result.get(qid, {}).get('answers', [])[:1]
if not pred_ans:
pred_ans = [EMPTY]
return [(qid, pred_ans)], [(qid, ref_ans)]
def get_entity_result(qid, pred_result, ref_result):
"""
Prepare answers for task 'entity'.
Args:
qid: question_id.
pred_result: A dict that includes all question_ids' result information
read from args.pred_file.
ref_result: A dict that includes all question_ids' result information
read from args.ref_file.
Returns:
Two lists: the first contains the predicted result and the second
contains the reference result for the same question_id. Each list holds
(question_id, answers) tuples, where 'answers' is a list of strings.
Returns (None, None) if the question type is not 'ENTITY'.
"""
if ref_result[qid]['question_type'] != 'ENTITY':
return None, None
return get_main_result(qid, pred_result, ref_result)
def get_desc_result(qid, pred_result, ref_result):
"""
Prepare answers for task 'description'.
Args:
qid: question_id.
pred_result: A dict that includes all question_ids' result information
read from args.pred_file.
ref_result: A dict that includes all question_ids' result information
read from args.ref_file.
Returns:
Two lists: the first contains the predicted result and the second
contains the reference result for the same question_id. Each list holds
(question_id, answers) tuples, where 'answers' is a list of strings.
Returns (None, None) if the question type is not 'DESCRIPTION'.
"""
if ref_result[qid]['question_type'] != 'DESCRIPTION':
return None, None
return get_main_result(qid, pred_result, ref_result)
def get_yesno_result(qid, pred_result, ref_result):
"""
Prepare answers for task 'yesno'.
Args:
qid: question_id.
pred_result: A dict that includes all question_ids' result information
read from args.pred_file.
ref_result: A dict that includes all question_ids' result information
read from args.ref_file.
Returns:
Two lists: the first contains the predicted result and the second
contains the reference result for the same question_id. Each list holds
(question_id, answers) tuples, where 'answers' is a list of strings.
Returns (None, None) if the question type is not 'YES_NO'.
"""
def _uniq(li, is_ref):
uniq_li = []
left = []
keys = set()
for k, v in li:
if k not in keys:
uniq_li.append((k, v))
keys.add(k)
else:
left.append((k, v))
if is_ref:
dict_li = dict(uniq_li)
for k, v in left:
dict_li[k] += v
uniq_li = [(k, v) for k, v in dict_li.items()]
return uniq_li
def _expand_result(uniq_li):
expanded = uniq_li[:]
keys = set([x[0] for x in uniq_li])
for k in YESNO_LABELS - keys:
expanded.append((k, [EMPTY]))
return expanded
def _get_yesno_ans(qid, result_dict, is_ref=False):
if qid not in result_dict:
return [(str(qid) + '_' + k, v) for k, v in _expand_result([])]
yesno_answers = result_dict[qid]['yesno_answers']
answers = result_dict[qid]['answers']
lbl_ans = _uniq([(k, [v]) for k, v in zip(yesno_answers, answers)],
is_ref)
ret = [(str(qid) + '_' + k, v) for k, v in _expand_result(lbl_ans)]
return ret
if ref_result[qid]['question_type'] != 'YES_NO':
return None, None
ref_ans = _get_yesno_ans(qid, ref_result, is_ref=True)
pred_ans = _get_yesno_ans(qid, pred_result)
return pred_ans, ref_ans
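The yes/no preparation above turns one question into one entry per label, keyed as `<question_id>_<label>`, and pads missing labels with a placeholder so every label can be scored separately. A minimal sketch of the reference-side behavior, where answers sharing a label are merged. `EMPTY` and `YESNO_LABELS` are defined outside this excerpt, so the values below are assumptions, and `expand_yesno` is a hypothetical helper name:

```python
EMPTY = ''  # assumed placeholder value; the real constant is defined elsewhere
YESNO_LABELS = {'Yes', 'No', 'Depends'}  # assumed from the keys used in get_metrics


def expand_yesno(qid, yesno_answers, answers):
    """Group answers by yes/no label and key each group as '<qid>_<label>'."""
    grouped = {}
    for label, answer in zip(yesno_answers, answers):
        # Reference-side behavior: answers with the same label are merged.
        grouped.setdefault(label, []).append(answer)
    # Labels with no answer get a single placeholder answer.
    for label in YESNO_LABELS - set(grouped):
        grouped[label] = [EMPTY]
    return {'{}_{}'.format(qid, label): ans for label, ans in grouped.items()}


out = expand_yesno(7, ['Yes', 'Yes', 'No'], ['a', 'b', 'c'])
print(sorted(out.items()))
```

On the prediction side the original keeps only the first answer per label instead of merging, which is why `prepare_bleu` can later assert exactly one predicted answer per key.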
def get_all_result(qid, pred_result, ref_result):
"""
Prepare answers for task 'all'.
Args:
qid: question_id.
pred_result: A dict that includes all question_ids' result information
read from args.pred_file.
ref_result: A dict that includes all question_ids' result information
read from args.ref_file.
Returns:
Two lists: the first contains the predicted result and the second
contains the reference result for the same question_id. Each list holds
(question_id, answers) tuples, where 'answers' is a list of strings.
"""
if ref_result[qid]['question_type'] == 'YES_NO':
return get_yesno_result(qid, pred_result, ref_result)
return get_main_result(qid, pred_result, ref_result)
def format_metrics(metrics, task, err_msg):
"""
Format metrics. The 'errorMsg' field carries any error that occurred during evaluation.
Args:
metrics: A dict object contains metrics for different tasks.
task: Task name.
err_msg: Exception raised during evaluation.
Returns:
Formatted result.
"""
result = {}
sources = ["both", "search", "zhidao"]
if err_msg is not None:
return {'errorMsg': str(err_msg), 'errorCode': 1, 'data': []}
data = []
if task != 'all' and task != 'main':
sources = ["both"]
if task == 'entity':
metric_names = ["Bleu-4", "Rouge-L"]
metric_names_prf = ["F1", "Precision", "Recall"]
for name in metric_names + metric_names_prf:
for src in sources:
obj = {
"name": name,
"value": round(metrics[src].get(name, 0) * 100, 2),
"type": src,
}
data.append(obj)
elif task == 'yesno':
metric_names = ["Bleu-4", "Rouge-L"]
details = ["Yes", "No", "Depends"]
src = sources[0]
for name in metric_names:
obj = {
"name": name,
"value": round(metrics[src].get(name, 0) * 100, 2),
"type": 'All',
}
data.append(obj)
for d in details:
obj = {
"name": name,
"value": \
round(metrics[src].get(d + '|' + name, 0) * 100, 2),
"type": d,
}
data.append(obj)
else:
metric_names = ["Bleu-4", "Rouge-L"]
for name in metric_names:
for src in sources:
obj = {
"name": name,
"value": \
round(metrics[src].get(name, 0) * 100, 2),
"type": src,
}
data.append(obj)
result["data"] = data
result["errorCode"] = 0
result["errorMsg"] = "success"
return result
def main(args):
"""
Do evaluation.
"""
err = None
metrics = {}
try:
pred_result = read_file(args.pred_file, args.task)
ref_result = read_file(args.ref_file, args.task, is_ref=True)
sources = ['both', 'search', 'zhidao']
if args.task not in set(['main', 'all']):
sources = sources[:1]
for source in sources:
metrics[source] = get_metrics(pred_result, ref_result, args.task,
source)
except ValueError as ve:
err = ve
except AssertionError as ae:
err = ae
print(json.dumps(
format_metrics(metrics, args.task, err), ensure_ascii=False).encode(
'utf8'))
if __name__ == '__main__':
parser = argparse.ArgumentParser()
parser.add_argument('pred_file', help='predict file')
parser.add_argument('ref_file', help='reference file')
parser.add_argument(
'task', help='task name: Main|Yes_No|All|Entity|Description')
args = parser.parse_args()
args.task = args.task.lower().replace('_', '')
main(args)
# -*- coding:utf8 -*-
# ==============================================================================
# Copyright 2017 Baidu.com, Inc. All Rights Reserved
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""
Utility function to generate vocabulary file.
"""
import argparse
import sys
import json
from itertools import chain
def get_vocab(files, vocab_file):
"""
Builds vocabulary file from field 'segmented_paragraphs'
and 'segmented_question'.
Args:
files: A list of file names.
vocab_file: The file that stores the vocabulary.
"""
vocab = {}
for f in files:
with open(f, 'r') as fin:
for line in fin:
obj = json.loads(line.strip())
paras = [
chain(*d['segmented_paragraphs']) for d in obj['documents']
]
doc_tokens = chain(*paras)
question_tokens = obj['segmented_question']
for t in list(doc_tokens) + question_tokens:
vocab[t] = vocab.get(t, 0) + 1
# output
sorted_vocab = sorted(
[(v, c) for v, c in vocab.items()], key=lambda x: x[1], reverse=True)
with open(vocab_file, 'w') as outf:
for w, c in sorted_vocab:
print >> outf, '{}\t{}'.format(w.encode('utf8'), c)
if __name__ == '__main__':
parser = argparse.ArgumentParser()
parser.add_argument(
'--files',
nargs='+',
required=True,
help='file list to count vocab from.')
parser.add_argument(
'--vocab', required=True, help='file to store counted vocab.')
args = parser.parse_args()
get_vocab(args.files, args.vocab)
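The counting loop in `get_vocab` can also be expressed with `collections.Counter`. A minimal sketch, assuming each input line is a JSON object with the `documents`, `segmented_paragraphs` and `segmented_question` fields used above (the function name `count_tokens` and the sample record are illustrative):

```python
import json
from collections import Counter
from itertools import chain


def count_tokens(json_lines):
    """Count tokens from paragraphs and questions, most frequent first."""
    vocab = Counter()
    for line in json_lines:
        obj = json.loads(line)
        for doc in obj['documents']:
            # Each document holds a list of paragraphs, each a list of tokens.
            vocab.update(chain.from_iterable(doc['segmented_paragraphs']))
        vocab.update(obj['segmented_question'])
    return vocab.most_common()


# Hypothetical one-record input.
sample_line = json.dumps({
    'documents': [{'segmented_paragraphs': [['a', 'b'], ['a']]}],
    'segmented_question': ['a', 'c'],
})
counts = count_tokens([sample_line])
print(counts)
```

`most_common()` returns `(token, count)` pairs sorted by descending count, which matches the ordering written to the vocabulary file.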
#coding=utf8
import os, sys, json
import nltk
def _nltk_tokenize(sequence):
tokens = nltk.word_tokenize(sequence)
cur_char_offset = 0
token_offsets = []
token_words = []
for token in tokens:
cur_char_offset = sequence.find(token, cur_char_offset)
token_offsets.append(
[cur_char_offset, cur_char_offset + len(token) - 1])
token_words.append(token)
return token_offsets, token_words
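In `_nltk_tokenize` the search cursor is set to the start of the token just found, so two identical tokens in a row resolve to the same offset. A minimal sketch of one way to avoid that by moving the cursor past each match. It takes the tokens as an argument so it runs without NLTK, and `token_offsets` is a hypothetical helper name:

```python
def token_offsets(sequence, tokens):
    """Return [start, end] (inclusive) character offsets for each token."""
    offsets = []
    cursor = 0
    for token in tokens:
        start = sequence.find(token, cursor)
        if start < 0:
            # The tokenizer rewrote the text (e.g. normalized quotes),
            # so the token cannot be located verbatim.
            raise ValueError('token {!r} not found after offset {}'.format(
                token, cursor))
        end = start + len(token) - 1
        offsets.append([start, end])
        cursor = end + 1  # continue searching after this token
    return offsets


text = 'ha ha ha'
print(token_offsets(text, text.split()))
```

The callers in this file discard the offsets, so the difference only matters if the offsets are used later.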
def segment(input_js):
_, input_js['segmented_question'] = _nltk_tokenize(input_js['question'])
for doc_id, doc in enumerate(input_js['documents']):
doc['segmented_title'] = []
doc['segmented_paragraphs'] = []
for para_id, para in enumerate(doc['paragraphs']):
_, seg_para = _nltk_tokenize(para)
doc['segmented_paragraphs'].append(seg_para)
if 'answers' in input_js:
input_js['segmented_answers'] = []
for answer_id, answer in enumerate(input_js['answers']):
_, seg_answer = _nltk_tokenize(answer)
input_js['segmented_answers'].append(seg_answer)
if __name__ == '__main__':
if len(sys.argv) != 2:
print('Usage: tokenize_data.py <input_path>')
exit()
nltk.download('punkt')
for line in open(sys.argv[1]):
dureader_js = json.loads(line.strip())
segment(dureader_js)
print(json.dumps(dureader_js))
#coding=utf8
import sys
import json
import pandas as pd
def trans(input_js):
output_js = {}
output_js['question'] = input_js['query']
output_js['question_type'] = input_js['query_type']
output_js['question_id'] = input_js['query_id']
output_js['fact_or_opinion'] = ""
output_js['documents'] = []
for para_id, para in enumerate(input_js['passages']):
doc = {}
doc['title'] = ""
if 'is_selected' in para:
doc['is_selected'] = para['is_selected'] != 0
doc['paragraphs'] = [para['passage_text']]
output_js['documents'].append(doc)
if 'answers' in input_js:
output_js['answers'] = input_js['answers']
return output_js
if __name__ == '__main__':
if len(sys.argv) != 2:
print('Usage: marcov1_to_dureader.py <input_path>')
exit()
df = pd.read_json(sys.argv[1])
for row in df.iterrows():
marco_js = json.loads(row[1].to_json())
dureader_js = trans(marco_js)
print(json.dumps(dureader_js))
import sys
import json
import pandas as pd
if __name__ == '__main__':
if len(sys.argv) != 3:
print('Usage: tojson.py <input_path> <output_path>')
exit()
infile = sys.argv[1]
outfile = sys.argv[2]
df = pd.read_json(infile)
with open(outfile, 'w') as f:
for row in df.iterrows():
f.write(row[1].to_json() + '\n')
###############################################################################
# ==============================================================================
# Copyright 2017 Baidu.com, Inc. All Rights Reserved
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""
This module finds the most related paragraph of each document according to recall.
"""
import sys
if sys.version[0] == '2':
reload(sys)
sys.setdefaultencoding("utf-8")
import json
from collections import Counter
def precision_recall_f1(prediction, ground_truth):
"""
This function calculates and returns the precision, recall and f1-score
Args:
prediction: prediction string or list to be matched
ground_truth: golden string or list reference
Returns:
floats of (p, r, f1)
Raises:
None
"""
if not isinstance(prediction, list):
prediction_tokens = prediction.split()
else:
prediction_tokens = prediction
if not isinstance(ground_truth, list):
ground_truth_tokens = ground_truth.split()
else:
ground_truth_tokens = ground_truth
common = Counter(prediction_tokens) & Counter(ground_truth_tokens)
num_same = sum(common.values())
if num_same == 0:
return 0, 0, 0
p = 1.0 * num_same / len(prediction_tokens)
r = 1.0 * num_same / len(ground_truth_tokens)
f1 = (2 * p * r) / (p + r)
return p, r, f1
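The overlap above is a multiset intersection: a token repeated in one sequence only counts as often as it appears in the other. A small worked example, written as a standalone sketch (`token_prf` is an illustrative name and takes token lists only):

```python
from collections import Counter


def token_prf(prediction_tokens, ground_truth_tokens):
    """Token-level precision, recall and F1 using multiset overlap."""
    # Counter & Counter keeps the minimum count of each shared token.
    common = Counter(prediction_tokens) & Counter(ground_truth_tokens)
    num_same = sum(common.values())
    if num_same == 0:
        return 0.0, 0.0, 0.0
    p = num_same / len(prediction_tokens)
    r = num_same / len(ground_truth_tokens)
    return p, r, 2 * p * r / (p + r)


# Prediction has 'a' twice and the reference has 'b' twice;
# the overlap is one 'a' plus one 'b', i.e. 2 tokens.
p, r, f1 = token_prf('a a b c'.split(), 'a b b d e'.split())
print(p, r, f1)
```

Here precision is 2/4, recall is 2/5, and F1 is their harmonic mean.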
def recall(prediction, ground_truth):
"""
This function calculates and returns the recall
Args:
prediction: prediction string or list to be matched
ground_truth: golden string or list reference
Returns:
floats of recall
Raises:
None
"""
return precision_recall_f1(prediction, ground_truth)[1]
def f1_score(prediction, ground_truth):
"""
This function calculates and returns the f1-score
Args:
prediction: prediction string or list to be matched
ground_truth: golden string or list reference
Returns:
floats of f1
Raises:
None
"""
return precision_recall_f1(prediction, ground_truth)[2]
def metric_max_over_ground_truths(metric_fn, prediction, ground_truths):
"""
This function returns the highest score that metric_fn assigns to the
prediction over all ground truths
Args:
metric_fn: metric function which scores a prediction against one ground truth.
prediction: prediction string or list to be matched
ground_truths: list of golden strings or lists to compare against
Returns:
the maximum score (float)
Raises:
None
"""
scores_for_ground_truths = []
for ground_truth in ground_truths:
score = metric_fn(prediction, ground_truth)
scores_for_ground_truths.append(score)
return max(scores_for_ground_truths)
def find_best_question_match(doc, question, with_score=False):
"""
For each document, find the paragraph that best matches the question.
Args:
doc: The document object.
question: The question tokens.
with_score: If True then the match score will be returned,
otherwise False.
Returns:
The index of the best match paragraph, if with_score=False,
otherwise returns a tuple of the index of the best match paragraph
and the match score of that paragraph.
"""
most_related_para = -1
max_related_score = 0
most_related_para_len = 0
for p_idx, para_tokens in enumerate(doc['segmented_paragraphs']):
if len(question) > 0:
related_score = metric_max_over_ground_truths(recall, para_tokens,
[question])
else:
related_score = 0
if related_score > max_related_score \
or (related_score == max_related_score \
and len(para_tokens) < most_related_para_len):
most_related_para = p_idx
max_related_score = related_score
most_related_para_len = len(para_tokens)
if most_related_para == -1:
most_related_para = 0
if with_score:
return most_related_para, max_related_score
return most_related_para
def find_fake_answer(sample):
"""
For each document, finds the most related paragraph based on recall,
then finds the span that maximizes the f1_score against the gold answers
and uses this span as a fake answer span
Args:
sample: a sample in the dataset
Returns:
None
Raises:
None
"""
for doc in sample['documents']:
most_related_para = -1
most_related_para_len = 999999
max_related_score = 0
for p_idx, para_tokens in enumerate(doc['segmented_paragraphs']):
if len(sample['segmented_answers']) > 0:
related_score = metric_max_over_ground_truths(
recall, para_tokens, sample['segmented_answers'])
else:
continue
if related_score > max_related_score \
or (related_score == max_related_score
and len(para_tokens) < most_related_para_len):
most_related_para = p_idx
most_related_para_len = len(para_tokens)
max_related_score = related_score
doc['most_related_para'] = most_related_para
sample['answer_docs'] = []
sample['answer_spans'] = []
sample['fake_answers'] = []
sample['match_scores'] = []
best_match_score = 0
best_match_d_idx, best_match_span = -1, [-1, -1]
best_fake_answer = None
answer_tokens = set()
for segmented_answer in sample['segmented_answers']:
answer_tokens = answer_tokens | set(
[token for token in segmented_answer])
for d_idx, doc in enumerate(sample['documents']):
if not doc['is_selected']:
continue
if doc['most_related_para'] == -1:
doc['most_related_para'] = 0
most_related_para_tokens = doc['segmented_paragraphs'][doc[
'most_related_para']][:1000]
for start_tidx in range(len(most_related_para_tokens)):
if most_related_para_tokens[start_tidx] not in answer_tokens:
continue
for end_tidx in range(
len(most_related_para_tokens) - 1, start_tidx - 1, -1):
span_tokens = most_related_para_tokens[start_tidx:end_tidx + 1]
if len(sample['segmented_answers']) > 0:
match_score = metric_max_over_ground_truths(
f1_score, span_tokens, sample['segmented_answers'])
else:
match_score = 0
if match_score == 0:
break
if match_score > best_match_score:
best_match_d_idx = d_idx
best_match_span = [start_tidx, end_tidx]
best_match_score = match_score
best_fake_answer = ''.join(span_tokens)
if best_match_score > 0:
sample['answer_docs'].append(best_match_d_idx)
sample['answer_spans'].append(best_match_span)
sample['fake_answers'].append(best_fake_answer)
sample['match_scores'].append(best_match_score)
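The span search in `find_fake_answer` is a brute-force scan: every span that starts on a token found in some gold answer is scored by F1 and the best one is kept. A minimal single-paragraph sketch of that idea. The original also skips unselected documents and truncates the paragraph to 1000 tokens; `span_f1` and `best_span` are illustrative names:

```python
from collections import Counter


def span_f1(span_tokens, answer_tokens):
    """Token-level F1 between a candidate span and one gold answer."""
    common = Counter(span_tokens) & Counter(answer_tokens)
    num_same = sum(common.values())
    if num_same == 0:
        return 0.0
    p = num_same / len(span_tokens)
    r = num_same / len(answer_tokens)
    return 2 * p * r / (p + r)


def best_span(paragraph_tokens, answers):
    """Return ((start, end), score) of the span with the highest F1."""
    answer_vocab = {tok for ans in answers for tok in ans}
    best_score, best = 0.0, None
    for start, tok in enumerate(paragraph_tokens):
        # Only start a span on a token that occurs in some gold answer.
        if tok not in answer_vocab:
            continue
        # Try the longest span first and shrink it towards the start token.
        for end in range(len(paragraph_tokens) - 1, start - 1, -1):
            span = paragraph_tokens[start:end + 1]
            score = max(span_f1(span, ans) for ans in answers)
            if score > best_score:
                best_score, best = score, (start, end)
    return best, best_score


tokens = 'the capital of france is paris .'.split()
print(best_span(tokens, [['paris']]))
```

The scan is quadratic in the paragraph length, with an F1 computation per span, which is presumably why the original caps the paragraph at 1000 tokens.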
if __name__ == '__main__':
for line in sys.stdin:
sample = json.loads(line)
find_fake_answer(sample)
print(json.dumps(sample, ensure_ascii=False))
#!/bin/bash
input_file=$1
output_file=$2
# convert the data from MARCO V2 (json) format to MARCO V1 (jsonl) format.
# the script was forked from the MARCO repo.
# the MARCO V1 format is much easier to explore.
python3 marcov2_to_v1_tojsonl.py $input_file $input_file.marcov1
# convert the data from MARCO V1 format to DuReader format.
python3 marcov1_to_dureader.py $input_file.marcov1 >$input_file.dureader_raw
# tokenize the data.
python3 marco_tokenize_data.py $input_file.dureader_raw >$input_file.segmented
# find fake answers (indicating the start and end positions of answers in the document) for train and dev sets.
# note that this should not be applied to the test set, since the test set has no ground truth.
python preprocess.py <$input_file.segmented >$output_file
# remove the temporary data files.
rm -rf $input_file.dureader_raw $input_file.segmented
 import os
 import math
 import random
-import cPickle
 import functools
 import numpy as np
-#import paddle.v2 as paddle
 import paddle
 from PIL import Image, ImageEnhance
@@ -45,9 +43,9 @@ for i, item in enumerate(test_list):
 test_data[label] = []
 test_data[label].append(path)
-print "train_data size:", len(train_data)
-print "test_data size:", len(test_data)
-print "test_data image number:", len(test_image_list)
+print("train_data size:", len(train_data))
+print("test_data size:", len(test_data))
+print("test_data image number:", len(test_image_list))
 random.shuffle(test_image_list)
@@ -214,11 +212,11 @@ def eml_iterator(data,
 color_jitter=False,
 rotate=False):
 def reader():
-labs = data.keys()
+labs = list(data.keys())
 lab_num = len(labs)
-ind = range(0, lab_num)
+ind = list(range(0, lab_num))
 assert batch_size % samples_each_class == 0, "batch_size % samples_each_class != 0"
-num_class = batch_size/samples_each_class
+num_class = batch_size // samples_each_class
 for i in range(iter_size):
 random.shuffle(ind)
 for n in range(num_class):
@@ -245,9 +243,9 @@ def quadruplet_iterator(data,
 color_jitter=False,
 rotate=False):
 def reader():
-labs = data.keys()
+labs = list(data.keys())
 lab_num = len(labs)
-ind = range(0, lab_num)
+ind = list(range(0, lab_num))
 for i in range(iter_size):
 random.shuffle(ind)
 ind_sample = ind[:class_num]
@@ -255,7 +253,7 @@ def quadruplet_iterator(data,
 for ind_i in ind_sample:
 lab = labs[ind_i]
 data_list = data[lab]
-data_ind = range(0, len(data_list))
+data_ind = list(range(0, len(data_list)))
 random.shuffle(data_ind)
 anchor_ind = data_ind[:samples_each_class]
@@ -277,15 +275,15 @@ def triplet_iterator(data,
 color_jitter=False,
 rotate=False):
 def reader():
-labs = data.keys()
+labs = list(data.keys())
 lab_num = len(labs)
-ind = range(0, lab_num)
+ind = list(range(0, lab_num))
 for i in range(iter_size):
 random.shuffle(ind)
 ind_pos, ind_neg = ind[:2]
 lab_pos = labs[ind_pos]
 pos_data_list = data[lab_pos]
-data_ind = range(0, len(pos_data_list))
+data_ind = list(range(0, len(pos_data_list)))
 random.shuffle(data_ind)
 anchor_ind, pos_ind = data_ind[:2]
@@ -346,7 +344,7 @@ def quadruplet_train(class_num, samples_each_class):
 def triplet_train(batch_size):
 assert(batch_size % 3 == 0)
-return triplet_iterator(train_data, 'train', batch_size, iter_size = batch_size/3 * 100, \
+return triplet_iterator(train_data, 'train', batch_size, iter_size = batch_size//3 * 100, \
 shuffle=True, color_jitter=False, rotate=False)
 def test():
......
-import datareader as reader
 import math
 import numpy as np
 import paddle.fluid as fluid
-from metrics import calculate_order_dist_matrix
-from metrics import get_gpu_num
+from . import datareader as reader
+from .metrics import calculate_order_dist_matrix
+from .metrics import get_gpu_num
 class emlloss():
 def __init__(self, train_batch_size = 40, samples_each_class=2):
@@ -11,9 +11,9 @@ class emlloss():
 self.samples_each_class = samples_each_class
 self.train_batch_size = train_batch_size
 assert(train_batch_size % num_gpus == 0)
-self.cal_loss_batch_size = train_batch_size / num_gpus
+self.cal_loss_batch_size = train_batch_size // num_gpus
 assert(self.cal_loss_batch_size % samples_each_class == 0)
-class_num = train_batch_size / samples_each_class
+class_num = train_batch_size // samples_each_class
 self.train_reader = reader.eml_train(train_batch_size, samples_each_class)
 self.test_reader = reader.test()
......
-import datareader as reader
+from . import datareader as reader
 import paddle.fluid as fluid
 class tripletloss():
......