未验证 提交 dd8bdef7 编写于 作者: C Cheerego 提交者: GitHub

routine update 0406 (#755)

* fix conv3d Chinese (#729)

* fix typo (#751)

* update models link (#753)

* update models link

* fix en

* update title

* fix cn title

* follow comment

* update_0405 (#754)
上级 8883da8b
......@@ -890,7 +890,7 @@ ExecutionStrategy
.. py:attribute:: num_iteration_per_drop_scope
int型成员。它表明了清空执行时产生的临时变量需要的程序执行重复次数。因为临时变量的形可能在两次重复过程中保持一致,所以它会使整体执行过程更快。默认值为100。
int型成员。它表明了清空执行时产生的临时变量需要的程序执行重复次数。因为临时变量的形可能在两次重复过程中保持一致,所以它会使整体执行过程更快。默认值为100。
.. note::
1. 如果在调用 ``run`` 方法时获取结果数据,``ParallelExecutor`` 会在当前程序重复执行尾部清空临时变量
......
......@@ -2293,7 +2293,7 @@ conv3d
.. py:function:: paddle.fluid.layers.conv3d(input, num_filters, filter_size, stride=1, padding=0, dilation=1, groups=None, param_attr=None, bias_attr=None, use_cudnn=True, act=None, name=None)
卷积三维层(convolution3D layer)根据输入、滤波器(filter)、步长(stride)、填充(padding)、膨胀(dilations)、组数参数计算得到输出。输入和输出是NCHW格式,N是批尺寸,C是通道数,H是特征高度,W是特征宽度。卷积三维(Convlution3D)和卷积二维(Convlution2D)相似,但多了一维深度(depth)。如果提供了bias属性和激活函数类型,bias会添加到卷积(convolution)的结果中相应的激活函数会作用在最终结果上。
3D卷积层(convolution3D layer)根据输入、滤波器(filter)、步长(stride)、填充(padding)、膨胀(dilations)、组数参数计算得到输出。输入和输出是NCHW格式,N是批尺寸,C是通道数,H是特征高度,W是特征宽度。卷积三维(Convlution3D)和卷积二维(Convlution2D)相似,但多了一维深度(depth)。如果提供了bias属性和激活函数类型,bias会添加到卷积(convolution)的结果中相应的激活函数会作用在最终结果上。
对每个输入X,有等式:
......@@ -2303,8 +2303,8 @@ conv3d
Out = \sigma \left ( W * X + b \right )
其中:
- :math:`X` :输入值,NCHW格式的张量(Tensor)
- :math:`W` :滤波器值,MCHW格式的张量(Tensor)
- :math:`X` :输入值,NCDHW格式的张量(Tensor)
- :math:`W` :滤波器值,MCDHW格式的张量(Tensor)
- :math:`*` : 卷积操作
- :math:`b` :Bias值,二维张量(Tensor),形为 ``[M,1]``
- :math:`\sigma` :激活函数
......@@ -2313,30 +2313,28 @@ conv3d
**示例**
- 输入:
输入shape: :math:`( N,C_{in},H_{in},W_{in} )`
输入shape: :math:`(N, C_{in}, D_{in}, H_{in}, W_{in})`
滤波器shape: :math:`( C_{out},C_{in},H_{f},W_{f} )`
滤波器shape: :math:`(C_{out}, C_{in}, D_f, H_f, W_f)`
- 输出:
输出shape: :math:`( N,C_{out},H_{out},W_{out} )`
输出shape: :math:`(N, C_{out}, D_{out}, H_{out}, W_{out})`
其中
.. math::
D_{out} = \frac{\left ( D_{in}+2*paddings[0]-\left ( dilations[0]*\left ( D_{f}-1 \right )+1 \right ) \right )}{strides[0]}+1
H_{out} = \frac{\left ( H_{in}+2*paddings[1]-\left ( dilations[1]*\left ( H_{f}-1 \right )+1 \right ) \right )}{strides[1]}+1
W_{out} = \frac{\left ( W_{in}+2*paddings[2]-\left ( dilations[2]*\left ( W_{f}-1 \right )+1 \right ) \right )}{strides[2]}+1
D_{out}&= \frac{(D_{in} + 2 * paddings[0] - (dilations[0] * (D_f - 1) + 1))}{strides[0]} + 1 \\
H_{out}&= \frac{(H_{in} + 2 * paddings[1] - (dilations[1] * (H_f - 1) + 1))}{strides[1]} + 1 \\
W_{out}&= \frac{(W_{in} + 2 * paddings[2] - (dilations[2] * (W_f - 1) + 1))}{strides[2]} + 1
参数:
- **input** (Variable) - 格式为[N,C,H,W]格式的输入图像
- **input** (Variable) - 格式为[N,C,D,H,W]格式的输入图像
- **num_fliters** (int) - 滤波器数。和输出图像通道相同
- **filter_size** (int|tuple|None) - 滤波器大小。如果filter_size是一个元组,则必须包含两个整型数,(filter_size,filter_size_W)。否则,滤波器为square
- **stride** (int|tuple) - 步长(stride)大小。如果步长(stride)为元组,则必须包含两个整型数,(stride_H,stride_W)。否则,stride_H = stride_W = stride。默认:stride = 1
- **padding** (int|tuple) - 填充(padding)大小。如果填充(padding)为元组,则必须包含两个整型数,(padding_H,padding_W)。否则,padding_H = padding_W = padding。默认:padding = 0
- **dilation** (int|tuple) - 膨胀(dilation)大小。如果膨胀(dialation)为元组,则必须包含两个整型数,(dilation_H,dilation_W)。否则,dilation_H = dilation_W = dilation。默认:dilation = 1
- **filter_size** (int|tuple|None) - 滤波器大小。如果filter_size是一个元组,则必须包含三个整型数,(filter_size_D, filter_size_H, filter_size_W)。否则,滤波器为棱长为int的立方体形。
- **stride** (int|tuple) - 步长(stride)大小。如果步长(stride)为元组,则必须包含三个整型数, (stride_D, stride_H, stride_W)。否则,stride_D = stride_H = stride_W = stride。默认:stride = 1
- **padding** (int|tuple) - 填充(padding)大小。如果填充(padding)为元组,则必须包含三个整型数,(padding_D, padding_H, padding_W)。否则, padding_D = padding_H = padding_W = padding。默认:padding = 0
- **dilation** (int|tuple) - 膨胀(dilation)大小。如果膨胀(dialation)为元组,则必须包含两个整型数, (dilation_D, dilation_H, dilation_W)。否则,dilation_D = dilation_H = dilation_W = dilation。默认:dilation = 1
- **groups** (int) - 卷积二维层(Conv2D Layer)的组数。根据Alex Krizhevsky的深度卷积神经网络(CNN)论文中的成组卷积:当group=2,滤波器的前一半仅和输入通道的前一半连接。滤波器的后一半仅和输入通道的后一半连接。默认:groups = 1
- **param_attr** (ParamAttr|None) - conv2d的可学习参数/权重的参数属性。如果设为None或者ParamAttr的一个属性,conv2d创建ParamAttr为param_attr。如果param_attr的初始化函数未设置,参数则初始化为 :math:`Normal(0.0,std)`,并且std为 :math:`\left ( \frac{2.0}{filter\_elem\_num} \right )^{0.5}` 。默认为None
- **bias_attr** (ParamAttr|bool|None) - conv2d bias的参数属性。如果设为False,则没有bias加到输出。如果设为None或者ParamAttr的一个属性,conv2d创建ParamAttr为bias_attr。如果bias_attr的初始化函数未设置,bias初始化为0.默认为None
......
......@@ -20,9 +20,13 @@ PaddlePaddle目前支持以下环境:
:code:`pip install paddlepaddle-gpu` (GPU版本最新)
:code:`pip install paddlepaddle==[pip版本号]`
注::code:`pip install paddlepaddle-gpu` 命令将安装支持CUDA 9.0 cuDNN v7的PaddlePaddle,如果您的CUDA或cuDNN版本与此不同,可以参考 `这里 <https://pypi.org/project/paddlepaddle-gpu/#history>`_ 了解其他CUDA/cuDNN版本所适用的安装命令
其中[pip版本号]请查阅 `PyPi.org <https://pypi.org/search/?q=PaddlePaddle>`_
如果您希望通过 ``pip`` 方式安装老版本的PaddlePaddle,您可以使用如下命令:
:code:`pip install paddlepaddle==[PaddlePaddle版本号]` (CPU版,具体版本号请参考 `这里 <https://pypi.org/project/paddlepaddle/#history>`_ )
:code:`pip install paddlepaddle-gpu==[PaddlePaddle版本号]` (GPU版,具体版本号请参考 `这里 <https://pypi.org/project/paddlepaddle-gpu/#history>`_ )
- 如果您希望使用 `docker <https://www.docker.com>`_ 安装PaddlePaddle可以直接使用以下命令:
:code:`docker run --name [Name of container] -it -v $PWD:/paddle hub.baidubce.com/paddlepaddle/paddle:[docker版本号] /bin/bash`
......
`Fluid 模型库 <https://github.com/PaddlePaddle/models/tree/develop/fluid>`__
`Fluid 模型库 <https://github.com/PaddlePaddle/models>`__
============
图像分类
......@@ -8,21 +8,21 @@
在深度学习时代,图像分类的准确率大幅度提升,在图像分类任务中,我们向大家介绍了如何在经典的数据集ImageNet上,训练常用的模型,包括AlexNet、VGG、GoogLeNet、ResNet、Inception-v4、MobileNet、DPN(Dual
Path
Network)、SE-ResNeXt模型,也开源了\ `训练的模型 <https://github.com/PaddlePaddle/models/blob/develop/fluid/PaddleCV/image_classification/README_cn.md#已有模型及其性能>`__\ 方便用户下载使用。同时提供了能够将Caffe模型转换为PaddlePaddle
Network)、SE-ResNeXt模型,也开源了\ `训练的模型 <https://github.com/PaddlePaddle/models/blob/develop/PaddleCV/image_classification/README_cn.md#已有模型及其性能>`__\ 方便用户下载使用。同时提供了能够将Caffe模型转换为PaddlePaddle
Fluid模型配置和参数文件的工具。
- `AlexNet <https://github.com/PaddlePaddle/models/tree/develop/fluid/PaddleCV/image_classification/models>`__
- `VGG <https://github.com/PaddlePaddle/models/tree/develop/fluid/PaddleCV/image_classification/models>`__
- `GoogleNet <https://github.com/PaddlePaddle/models/tree/develop/fluid/PaddleCV/image_classification/models>`__
- `AlexNet <https://github.com/PaddlePaddle/models/tree/develop/PaddleCV/image_classification/models>`__
- `VGG <https://github.com/PaddlePaddle/models/tree/develop/PaddleCV/image_classification/models>`__
- `GoogleNet <https://github.com/PaddlePaddle/models/tree/develop/PaddleCV/image_classification/models>`__
- `Residual
Network <https://github.com/PaddlePaddle/models/tree/develop/fluid/PaddleCV/image_classification/models>`__
- `Inception-v4 <https://github.com/PaddlePaddle/models/tree/develop/fluid/PaddleCV/image_classification/models>`__
- `MobileNet <https://github.com/PaddlePaddle/models/tree/develop/fluid/PaddleCV/image_classification/models>`__
Network <https://github.com/PaddlePaddle/models/tree/develop/PaddleCV/image_classification/models>`__
- `Inception-v4 <https://github.com/PaddlePaddle/models/tree/develop/PaddleCV/image_classification/models>`__
- `MobileNet <https://github.com/PaddlePaddle/models/tree/develop/PaddleCV/image_classification/models>`__
- `Dual Path
Network <https://github.com/PaddlePaddle/models/tree/develop/fluid/PaddleCV/image_classification/models>`__
- `SE-ResNeXt <https://github.com/PaddlePaddle/models/tree/develop/fluid/PaddleCV/image_classification/models>`__
Network <https://github.com/PaddlePaddle/models/tree/develop/PaddleCV/image_classification/models>`__
- `SE-ResNeXt <https://github.com/PaddlePaddle/models/tree/develop/PaddleCV/image_classification/models>`__
- `Caffe模型转换为Paddle
Fluid配置和模型文件工具 <https://github.com/PaddlePaddle/models/tree/develop/fluid/PaddleCV/caffe2fluid>`__
Fluid配置和模型文件工具 <https://github.com/PaddlePaddle/models/tree/develop/PaddleCV/caffe2fluid>`__
目标检测
--------
......@@ -33,14 +33,11 @@ Fluid模型配置和参数文件的工具。
VOC <http://host.robots.ox.ac.uk/pascal/VOC/>`__\ 、\ `MS
COCO <http://cocodataset.org/#home>`__\ 数据训练通用物体检测模型,当前介绍了SSD算法,SSD全称Single Shot MultiBox Detector,是目标检测领域较新且效果较好的检测算法之一,具有检测速度快且检测精度高的特点。
开放环境中的检测人脸,尤其是小的、模糊的和部分遮挡的人脸也是一个具有挑战的任务。我们也介绍了如何基于 `WIDER FACE <http://mmlab.ie.cuhk.edu.hk/projects/WIDERFace/>`_ 数据训练百度自研的人脸检测PyramidBox模型,该算法于2018年3月份在WIDER FACE的多项评测中均获得 `第一名 <http://mmlab.ie.cuhk.edu.hk/projects/WIDERFace/WiderFace_Results.html>`_ 。
RCNN系列模型是典型的两阶段目标检测器,相较于传统提取区域的方法,RCNN中RPN网络通过共享卷积层参数大幅提高提取区域的效率,并提出高质量的候选区域。其中典型模型包括Faster RCNN和Mask RCNN。
开放环境中的检测人脸,尤其是小的、模糊的和部分遮挡的人脸也是一个具有挑战的任务。我们也介绍了如何基于 `WIDER FACE <http://mmlab.ie.cuhk.edu.hk/projects/WIDERFace/>`_ 数据训练百度自研的人脸检测PyramidBox模型,该算法于2018年3月份在WIDER FACE的多项评测中均获得 `第一名 <http://mmlab.ie.cuhk.edu.hk/projects/WIDERFace/WiderFace_Results.html>`_。
- `Single Shot MultiBox
Detector <https://github.com/PaddlePaddle/models/blob/develop/fluid/PaddleCV/object_detection/README_cn.md>`__
- `Face Detector: PyramidBox <https://github.com/PaddlePaddle/models/tree/develop/fluid/PaddleCV/face_detection/README_cn.md>`_
- `RCNN <https://github.com/PaddlePaddle/models/tree/develop/fluid/PaddleCV/rcnn/README_cn.md>`_
Detector <https://github.com/PaddlePaddle/models/blob/develop/PaddleCV/object_detection/README_cn.md>`__
- `Face Detector: PyramidBox <https://github.com/PaddlePaddle/models/tree/develop/PaddleCV/face_detection/README_cn.md>`_
图像语义分割
------------
......@@ -50,7 +47,7 @@ RCNN系列模型是典型的两阶段目标检测器,相较于传统提取区
在图像语义分割任务中,我们介绍如何基于图像级联网络(Image Cascade
Network,ICNet)进行语义分割,相比其他分割算法,ICNet兼顾了准确率和速度。
- `ICNet <https://github.com/PaddlePaddle/models/tree/develop/fluid/PaddleCV/icnet>`__
- `ICNet <https://github.com/PaddlePaddle/models/tree/develop/PaddleCV/icnet>`__
图像生成
-----------
......@@ -60,8 +57,8 @@ Network,ICNet)进行语义分割,相比其他分割算法,ICNet兼顾了准
在图像生成任务中,我们介绍了如何使用DCGAN和ConditioanlGAN来进行手写数字的生成,另外还介绍了用于风格迁移的CycleGAN.
- `DCGAN & ConditionalGAN <https://github.com/PaddlePaddle/models/tree/develop/fluid/PaddleCV/gan/c_gan>`__
- `CycleGAN <https://github.com/PaddlePaddle/models/tree/develop/fluid/PaddleCV/gan/cycle_gan>`__
- `DCGAN & ConditionalGAN <https://github.com/PaddlePaddle/models/tree/develop/PaddleCV/gan/c_gan>`__
- `CycleGAN <https://github.com/PaddlePaddle/models/tree/develop/PaddleCV/gan/cycle_gan>`__
场景文字识别
------------
......@@ -70,8 +67,8 @@ Network,ICNet)进行语义分割,相比其他分割算法,ICNet兼顾了准
在场景文字识别任务中,我们介绍如何将基于CNN的图像特征提取和基于RNN的序列翻译技术结合,免除人工定义特征,避免字符分割,使用自动学习到的图像特征,完成字符识别。当前,介绍了CRNN-CTC模型和基于注意力机制的序列到序列模型。
- `CRNN-CTC模型 <https://github.com/PaddlePaddle/models/tree/develop/fluid/PaddleCV/ocr_recognition>`__
- `Attention模型 <https://github.com/PaddlePaddle/models/tree/develop/fluid/PaddleCV/ocr_recognition>`__
- `CRNN-CTC模型 <https://github.com/PaddlePaddle/models/tree/develop/PaddleCV/ocr_recognition>`__
- `Attention模型 <https://github.com/PaddlePaddle/models/tree/develop/PaddleCV/ocr_recognition>`__
度量学习
......@@ -80,7 +77,7 @@ Network,ICNet)进行语义分割,相比其他分割算法,ICNet兼顾了准
度量学习也称作距离度量学习、相似度学习,通过学习对象之间的距离,度量学习能够用于分析对象时间的关联、比较关系,在实际问题中应用较为广泛,可应用于辅助分类、聚类问题,也广泛用于图像检索、人脸识别等领域。以往,针对不同的任务,需要选择合适的特征并手动构建距离函数,而度量学习可根据不同的任务来自主学习出针对特定任务的度量距离函数。度量学习和深度学习的结合,在人脸识别/验证、行人再识别(human Re-ID)、图像检索等领域均取得较好的性能,在这个任务中我们主要介绍了基于Fluid的深度度量学习模型,包含了三元组、四元组等损失函数。
- `Metric Learning <https://github.com/PaddlePaddle/models/tree/develop/fluid/PaddleCV/metric_learning>`__
- `Metric Learning <https://github.com/PaddlePaddle/models/tree/develop/PaddleCV/metric_learning>`__
视频分类
......@@ -89,7 +86,7 @@ Network,ICNet)进行语义分割,相比其他分割算法,ICNet兼顾了准
视频分类是视频理解任务的基础,与图像分类不同的是,分类的对象不再是静止的图像,而是一个由多帧图像构成的、包含语音数据、包含运动信息等的视频对象,因此理解视频需要获得更多的上下文信息,不仅要理解每帧图像是什么、包含什么,还需要结合不同帧,知道上下文的关联信息。视频分类方法主要包含基于卷积神经网络、基于循环神经网络、或将这两者结合的方法。该任务中我们介绍基于Fluid的视频分类模型,目前包含Temporal Segment Network(TSN)模型,后续会持续增加更多模型。
- `TSN <https://github.com/PaddlePaddle/models/tree/develop/fluid/PaddleCV/video_classification>`__
- `TSN <https://github.com/PaddlePaddle/models/tree/develop/PaddleCV/video_classification>`__
......@@ -106,7 +103,7 @@ ASR
中深度学习模型端到端直接预测字词的分布不同,本实例更接近传统的语言识别流程,以音素为建模单元,关注语言识别中声学模型的训练,利用\ `kaldi <http://www.kaldi-asr.org>`__\ 进行音频数据的特征提取和标签对齐,并集成
kaldi 的解码器完成解码。
- `DeepASR <https://github.com/PaddlePaddle/models/blob/develop/fluid/DeepASR/README_cn.md>`__
- `DeepASR <https://github.com/PaddlePaddle/models/blob/develop/PaddleSpeech/DeepASR/README_cn.md>`__
机器翻译
--------
......@@ -125,7 +122,7 @@ RNN 结构的 NMT 得以应运而生,例如基于卷积神经网络 CNN
Attention 学习语言中的上下文依赖。相较于RNN/CNN,
这种结构在单层内计算复杂度更低、易于并行化、对长程依赖更易建模,最终在多种语言之间取得了最好的翻译效果。
- `Transformer <https://github.com/PaddlePaddle/models/blob/develop/fluid/PaddleNLP/neural_machine_translation/transformer/README_cn.md>`__
- `Transformer <https://github.com/PaddlePaddle/models/blob/develop/PaddleNLP/neural_machine_translation/transformer/README_cn.md>`__
强化学习
--------
......@@ -141,7 +138,7 @@ AlphaGo 就是 DRL
Q-Network, DQN)。本实例就是利用PaddlePaddle Fluid这个灵活的框架,实现了
DQN 及其变体,并测试了它们在 Atari 游戏中的表现。
- `DeepQNetwork <https://github.com/PaddlePaddle/models/blob/develop/fluid/DeepQNetwork/README_cn.md>`__
- `DeepQNetwork <https://github.com/PaddlePaddle/models/blob/develop/PaddleRL/DeepQNetwork/README_cn.md>`__
中文词法分析
------------
......@@ -166,7 +163,7 @@ DQN 及其变体,并测试了它们在 Atari 游戏中的表现。
本例所开放的DAM (Deep Attention Matching Network)为百度自然语言处理部发表于ACL-2018的工作,用于检索式聊天机器人多轮对话中应答的选择。DAM受Transformer的启发,其网络结构完全基于注意力(attention)机制,利用栈式的self-attention结构分别学习不同粒度下应答和语境的语义表示,然后利用cross-attention获取应答与语境之间的相关性,在两个大规模多轮对话数据集上的表现均好于其它模型。
- `Deep Attention Matching Network <https://github.com/PaddlePaddle/models/tree/develop/fluid/PaddleNLP/deep_attention_matching_net>`__
- `Deep Attention Matching Network <https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/deep_attention_matching_net>`__
AnyQ
----
......@@ -187,7 +184,7 @@ SimNet是百度自然语言处理部于2013年自主研发的语义匹配框架
百度阅读理解数据集是由百度自然语言处理部开源的一个真实世界数据集,所有的问题、原文都来源于实际数据(百度搜索引擎数据和百度知道问答社区),答案是由人类回答的。每个问题都对应多个答案,数据集包含200k问题、1000k原文和420k答案,是目前最大的中文MRC数据集。百度同时开源了对应的阅读理解模型,称为DuReader,采用当前通用的网络分层结构,通过双向attention机制捕捉问题和原文之间的交互关系,生成query-aware的原文表示,最终基于query-aware的原文表示通过point network预测答案范围。
- `DuReader in PaddlePaddle Fluid <https://github.com/PaddlePaddle/models/blob/develop/fluid/PaddleNLP/machine_reading_comprehension/README.md>`__
- `DuReader in PaddlePaddle Fluid <https://github.com/PaddlePaddle/models/blob/develop/PaddleNLP/machine_reading_comprehension/README.md>`__
个性化推荐
......@@ -197,8 +194,8 @@ SimNet是百度自然语言处理部于2013年自主研发的语义匹配框架
在工业可用的推荐系统中,推荐策略一般会被划分为多个模块串联执行。以新闻推荐系统为例,存在多个可以使用深度学习技术的环节,例如新闻的自动化标注,个性化新闻召回,个性化匹配与排序等。PaddlePaddle对推荐算法的训练提供了完整的支持,并提供了多种模型配置供用户选择。
- `TagSpace <https://github.com/PaddlePaddle/models/tree/develop/fluid/PaddleRec/tagspace>`_
- `GRU4Rec <https://github.com/PaddlePaddle/models/tree/develop/fluid/PaddleRec/gru4rec>`_
- `SequenceSemanticRetrieval <https://github.com/PaddlePaddle/models/tree/develop/fluid/PaddleRec/ssr>`_
- `DeepCTR <https://github.com/PaddlePaddle/models/blob/develop/fluid/PaddleRec/ctr/README.cn.md>`_
- `Multiview-Simnet <https://github.com/PaddlePaddle/models/tree/develop/fluid/PaddleRec/multiview_simnet>`_
- `TagSpace <https://github.com/PaddlePaddle/models/tree/develop/PaddleRec/tagspace>`_
- `GRU4Rec <https://github.com/PaddlePaddle/models/tree/develop/PaddleRec/gru4rec>`_
- `SequenceSemanticRetrieval <https://github.com/PaddlePaddle/models/tree/develop/PaddleRec/ssr>`_
- `DeepCTR <https://github.com/PaddlePaddle/models/blob/develop/PaddleRec/ctr/README.cn.md>`_
- `Multiview-Simnet <https://github.com/PaddlePaddle/models/tree/develop/PaddleRec/multiview_simnet>`_
`Fluid Model Library <https://github.com/PaddlePaddle/models/tree/develop/fluid>`__
`Fluid Model Library <https://github.com/PaddlePaddle/models>`__
============
Image classification
......@@ -7,17 +7,17 @@ Image classification
Image classification is based on the semantic information of images to distinguish different types of images. It is an important basic problem in computer vision. It is the basis of other high-level visual tasks such as object detection, image segmentation, object tracking, behavior analysis, face recognition, etc. The field has a wide range of applications. Such as: face recognition and intelligent video analysis in the security field, traffic scene recognition in the traffic field, content-based image retrieval and automatic classification of albums in the Internet field, image recognition in the medical field.
In the era of deep learning, the accuracy of image classification has been greatly improved. In the image classification task, we introduced how to train commonly used models in the classic dataset ImageNet, including AlexNet, VGG, GoogLeNet, ResNet, Inception- V4, MobileNet, DPN (Dual
Path Network), SE-ResNeXt model. We also provide open source \ `trained model <https://github.com/PaddlePaddle/models/blob/develop/fluid/PaddleCV/image_classification/README_cn.md#>`__\ to make it convenient for users to download and use. It also provides tools to convert Caffe models into PaddlePaddle Fluid model configurations and parameter files.
- `AlexNet <https://github.com/PaddlePaddle/models/tree/develop/fluid/PaddleCV/image_classification/models>`__
- `VGG <https://github.com/PaddlePaddle/models/tree/develop/fluid/PaddleCV/image_classification/models>`__
- `GoogleNet <https://github.com/PaddlePaddle/models/tree/develop/fluid/PaddleCV/image_classification/models>`__
- `Residual Network <https://github.com/PaddlePaddle/models/tree/develop/fluid/PaddleCV/image_classification/models>`__
- `Inception-v4 <https://github.com/PaddlePaddle/models/tree/develop/fluid/PaddleCV/image_classification/models>`__
- `MobileNet <https://github.com/PaddlePaddle/models/tree/develop/fluid/PaddleCV/image_classification/models>`__
- `Dual Path Network <https://github.com/PaddlePaddle/models/tree/develop/fluid/PaddleCV/image_classification/models>`__
- `SE-ResNeXt <https://github.com/PaddlePaddle/models/tree/develop/fluid/PaddleCV/image_classification/models>`__
- `Convert Caffe model to Paddle Fluid configuration and model file tools <https://github.com/PaddlePaddle/models/tree/develop/fluid/PaddleCV/caffe2fluid>`__
Path Network), SE-ResNeXt model. We also provide open source \ `trained model <https://github.com/PaddlePaddle/models/blob/develop/PaddleCV/image_classification/README_cn.md#>`__\ to make it convenient for users to download and use. It also provides tools to convert Caffe models into PaddlePaddle Fluid model configurations and parameter files.
- `AlexNet <https://github.com/PaddlePaddle/models/tree/develop/PaddleCV/image_classification/models>`__
- `VGG <https://github.com/PaddlePaddle/models/tree/develop/PaddleCV/image_classification/models>`__
- `GoogleNet <https://github.com/PaddlePaddle/models/tree/develop/PaddleCV/image_classification/models>`__
- `Residual Network <https://github.com/PaddlePaddle/models/tree/develop/PaddleCV/image_classification/models>`__
- `Inception-v4 <https://github.com/PaddlePaddle/models/tree/develop/PaddleCV/image_classification/models>`__
- `MobileNet <https://github.com/PaddlePaddle/models/tree/develop/PaddleCV/image_classification/models>`__
- `Dual Path Network <https://github.com/PaddlePaddle/models/tree/develop/PaddleCV/image_classification/models>`__
- `SE-ResNeXt <https://github.com/PaddlePaddle/models/tree/develop/PaddleCV/image_classification/models>`__
- `Convert Caffe model to Paddle Fluid configuration and model file tools <https://github.com/PaddlePaddle/models/tree/develop/PaddleCV/caffe2fluid>`__
Object Detection
-----------------
......@@ -28,8 +28,8 @@ In the object detection task, we introduced how to train general object detectio
Detecting human faces in an open environment, especially small, obscured and partially occluded faces is also a challenging task. We also introduced how to train Baidu's self-developed face detection PyramidBox model based on `WIDER FACE <http://mmlab.ie.cuhk.edu.hk/projects/WIDERFace/>`_ data. The algorithm won the `first place <http://mmlab.ie.cuhk.edu.hk/projects/WIDERFace/WiderFace_Results.html>`_ in multiple evaluations of WIDER FACE in March 2018 .
- `Single Shot MultiBox Detector <https://github.com/PaddlePaddle/models/blob/develop/fluid/PaddleCV/object_detection/README_cn.md>`__
- `Face Detector: PyramidBox <https://github.com/PaddlePaddle/models/tree/develop/fluid/PaddleCV/face_detection/README_cn.md>`_
- `Single Shot MultiBox Detector <https://github.com/PaddlePaddle/models/blob/develop/PaddleCV/object_detection/README_cn.md>`__
- `Face Detector: PyramidBox <https://github.com/PaddlePaddle/models/tree/develop/PaddleCV/face_detection/README_cn.md>`_
Image semantic segmentation
----------------------------
......@@ -38,7 +38,7 @@ As the name suggests, Image Semantic Segmentation is to group/segment pixels acc
In the image semantic segmentation task, we introduce how to perform semantic segmentation based on Image Cascade Network (ICNet). Compared with other segmentation algorithms, ICNet takes into account the accuracy and speed.
- `ICNet <https://github.com/PaddlePaddle/models/tree/develop/fluid/PaddleCV/icnet>`__
- `ICNet <https://github.com/PaddlePaddle/models/tree/develop/PaddleCV/icnet>`__
Image Synthesis
-----------------
......@@ -47,8 +47,8 @@ Image Synthesis refers to generating a target image based on an input vector. Th
In the image synthesis task, we introduced how to use DCGAN and ConditioanlGAN to generate handwritten numbers, and also introduced CycleGAN for style migration.
- `DCGAN & ConditionalGAN <https://github.com/PaddlePaddle/models/tree/develop/fluid/PaddleCV/gan/c_gan>`__
- `CycleGAN <https://github.com/PaddlePaddle/models/tree/develop/fluid/PaddleCV/gan/cycle_gan>`__
- `DCGAN & ConditionalGAN <https://github.com/PaddlePaddle/models/tree/develop/PaddleCV/gan/c_gan>`__
- `CycleGAN <https://github.com/PaddlePaddle/models/tree/develop/PaddleCV/gan/cycle_gan>`__
Scene Text Recognition
-----------------------
......@@ -57,8 +57,8 @@ Rich textual information is usually contained in scene images, which plays an im
In the scene text recognition task, we introduce how to combine CNN-based image feature extraction and RNN-based sequence translation technology, eliminate artificial definition features, avoid character segmentation, and use automatically learned image features to complete character recognition. Currently, the CRNN-CTC model and the sequence-to-sequence model based on the attention mechanism are introduced.
- `CRNN-CTC model <https://github.com/PaddlePaddle/models/tree/develop/fluid/PaddleCV/ocr_recognition>`__
- `Attention Model <https://github.com/PaddlePaddle/models/tree/develop/fluid/PaddleCV/ocr_recognition>`__
- `CRNN-CTC model <https://github.com/PaddlePaddle/models/tree/develop/PaddleCV/ocr_recognition>`__
- `Attention Model <https://github.com/PaddlePaddle/models/tree/develop/PaddleCV/ocr_recognition>`__
Metric learning
......@@ -67,7 +67,7 @@ Metric learning
Metric learning is also called distance metric learning or similarity learning. Through the distance between learning objects, metric learning can be used to analyze the association and comparison of objects. It can be applied to practical problems like auxiliary classification, aggregation and also widely used in areas such as image retrieval and face recognition. In the past, for different tasks, it was necessary to select appropriate features and manually construct a distance function, but the metric learning can initially learn the metric distance function for a specific task from the main task according to different tasks. The combination of metric learning and deep learning has achieved good performance in the fields of face recognition/verification, human re-ID, image retrieval, etc. In this task, we mainly introduce the depth-based metric learning based on Fluid. The model contains loss functions such as triples and quaternions.
- `Metric Learning <https://github.com/PaddlePaddle/models/tree/develop/fluid/PaddleCV/metric_learning>`__
- `Metric Learning <https://github.com/PaddlePaddle/models/tree/develop/PaddleCV/metric_learning>`__
Video classification
......@@ -76,7 +76,7 @@ Video classification
Video classification is the basis of video comprehension tasks. Unlike image classification, classified objects are no longer still images, but a video object composed of multi-frame images containing speech data and motion information, so to understand video needs to get more context information. To be specific, it needs not only to understand what each frame image is, what it contains, but also to combine different frames to know the context related information. The video classification method mainly includes a method based on convolutional neural networks, recurrent neural networks, or a combination of the two. In this task, we introduce the Fluid-based video classification model, which currently includes the Temporal Segment Network (TSN) model, and we will continuously add more models.
- `TSN <https://github.com/PaddlePaddle/models/tree/develop/fluid/PaddleCV/video_classification>`__
- `TSN <https://github.com/PaddlePaddle/models/tree/develop/PaddleCV/video_classification>`__
......@@ -87,7 +87,7 @@ Automatic Speech Recognition (ASR) is a technique for transcribing vocabulary co
Different from the end-to-end direct prediction for word distribution of the deep learning model `DeepSpeech <https://github.com/PaddlePaddle/DeepSpeech>`__ , this example is closer to the traditional language recognition process. With phoneme as the modeling unit, it focuses on the training of acoustic models in speech recognition, use `kaldi <http://www.kaldi-asr.org>`__ for feature extraction and label alignment of audio data, and integrate kaldi's decoder to complete decoding.
- `DeepASR <https://github.com/PaddlePaddle/models/blob/develop/fluid/DeepASR/README_cn.md>`__
- `DeepASR <https://github.com/PaddlePaddle/models/blob/develop/DeepASR/README_cn.md>`__
Machine Translation
---------------------
......@@ -97,7 +97,7 @@ Machine Translation transforms a natural language (source language) into another
The Transformer implemented in this example is a machine translation model based on the self-attention mechanism, in which there is no more RNN or CNN structure, but fully utilizes Attention to learn the context dependency. Compared with RNN/CNN, in a single layer, this structure has lower computational complexity, easier parallelization, and easier modeling for long-range dependencies, and finally achieves the best translation effect among multiple languages.
- `Transformer <https://github.com/PaddlePaddle/models/blob/develop/fluid/PaddleNLP/neural_machine_translation/transformer/README_cn.md>`__
- `Transformer <https://github.com/PaddlePaddle/models/blob/develop/PaddleNLP/neural_machine_translation/transformer/README_cn.md>`__
Reinforcement learning
-------------------------
......@@ -106,7 +106,7 @@ Reinforcement learning is an increasingly important machine learning direction i
The pioneering work of deep reinforcement learning is a successful application in Atari video games, which can directly accept high-dimensional input of video frames and predict the next action according to the image content end-to-end. The model used is called depth Q Network (Deep Q-Network, DQN). This example uses PaddlePaddle Fluid, our flexible framework, to implement DQN and its variants and test their performance in Atari games.
- `DeepQNetwork <https://github.com/PaddlePaddle/models/blob/develop/fluid/DeepQNetwork/README_cn.md>`__
- `DeepQNetwork <https://github.com/PaddlePaddle/models/blob/develop/DeepQNetwork/README_cn.md>`__
Chinese lexical analysis
---------------------------
......@@ -131,7 +131,7 @@ In many scenarios of natural language processing, it is necessary to measure the
The DAM (Deep Attention Matching Network) introduced in this example is the work of Baidu Natural Language Processing Department published in ACL-2018, which is used for the selection of responses in multi-round dialogue of retrieval chat robots. Inspired by Transformer, DAM is based entirely on the attention mechanism. It uses the stack-type self-attention structure to learn the semantic representations of responses and contexts at different granularities, and then uses cross-attention to obtain relativity between responses and contexts. The performance on the two large-scale multi-round dialogue datasets is better than other models.
- `Deep Attention Matching Network <https://github.com/PaddlePaddle/models/tree/develop/fluid/PaddleNLP/deep_attention_matching_net>`__
- `Deep Attention Matching Network <https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/deep_attention_matching_net>`__
AnyQ
----
......@@ -151,7 +151,7 @@ Machine Reading Comprehension (MRC) is one of the core tasks in Natural Language
Baidu reading comprehension dataset is an open-source real-world dataset publicized by Baidu Natural Language Processing Department. All the questions and original texts are derived from actual data (Baidu search engine data and Baidu know Q&A community), and the answer is given by humans. Each question corresponds to multiple answers. The dataset contains 200k questions, 1000k original text and 420k answers. It is currently the largest Chinese MRC dataset. Baidu also publicized the corresponding open-source reading comprehension model, called DuReader. DuReader adopts the current common network hierarchical structure, and captures the interaction between the problems and the original texts through the double attention mechanism to generate the original representation of the query-aware. Finally, based on the original text of query-aware, the answer scope is predicted by point network.
- `DuReader in PaddlePaddle Fluid <https://github.com/PaddlePaddle/models/blob/develop/fluid/PaddleNLP/machine_reading_comprehension/README.md>`__
- `DuReader in PaddlePaddle Fluid <https://github.com/PaddlePaddle/models/blob/develop/PaddleNLP/machine_reading_comprehension/README.md>`__
Personalized recommendation
......@@ -161,8 +161,8 @@ The recommendation system is playing an increasingly important role in the curre
In an industrially adoptable recommendation system, the recommendation strategy is generally divided into multiple modules in series. Take the news recommendation system as an example. There are multiple procedures that can use deep learning techniques, such as automated annotation of news, personalized news recall, personalized matching and sorting. PaddlePaddle provides complete support for the training of recommendation algorithms and provides a variety of model configurations for users to choose from.
- `TagSpace <https://github.com/PaddlePaddle/models/tree/develop/fluid/PaddleRec/tagspace>`_
- `GRU4Rec <https://github.com/PaddlePaddle/models/tree/develop/fluid/PaddleRec/gru4rec>`_
- `SequenceSemanticRetrieval <https://github.com/PaddlePaddle/models/tree/develop/fluid/PaddleRec/ssr>`_
- `DeepCTR <https://github.com/PaddlePaddle/models/blob/develop/fluid/PaddleRec/ctr/README.cn.md>`_
- `Multiview-Simnet <https://github.com/PaddlePaddle/models/tree/develop/fluid/PaddleRec/multiview_simnet>`_
- `TagSpace <https://github.com/PaddlePaddle/models/tree/develop/PaddleRec/tagspace>`_
- `GRU4Rec <https://github.com/PaddlePaddle/models/tree/develop/PaddleRec/gru4rec>`_
- `SequenceSemanticRetrieval <https://github.com/PaddlePaddle/models/tree/develop/PaddleRec/ssr>`_
- `DeepCTR <https://github.com/PaddlePaddle/models/blob/develop/PaddleRec/ctr/README.cn.md>`_
- `Multiview-Simnet <https://github.com/PaddlePaddle/models/tree/develop/PaddleRec/multiview_simnet>`_
Markdown is supported
0% .
You are about to add 0 people to the discussion. Proceed with caution.
先完成此消息的编辑!
想要评论请 注册