Image classification is based on the semantic information of images to distinguish different types of images. It is an important basic problem in computer vision. It is the basis of other high-level visual tasks such as object detection, image segmentation, object tracking, behavior analysis, face recognition, etc. The field has a wide range of applications. Such as: face recognition and intelligent video analysis in the security field, traffic scene recognition in the traffic field, content-based image retrieval and automatic classification of albums in the Internet field, image recognition in the medical field.
Image classification is based on the semantic information of images to distinguish different types of images. It is an important basic problem in computer vision. It is the basis of other high-level visual tasks such as object detection, image segmentation, object tracking, behavior analysis, face recognition, etc. The field has a wide range of applications, such as: face recognition and intelligent video analysis in the security field, traffic scene recognition in the traffic field, content-based image retrieval and automatic classification of music albums in the Internet field, image recognition in the medical field.
In the era of deep learning, the accuracy of image classification has been greatly improved. In the image classification task, we introduced how to train commonly used models in the classic dataset ImageNet, including AlexNet, VGG, GoogLeNet, ResNet, Inception- V4, MobileNet, DPN (Dual
In the era of deep learning, the accuracy of image classification has been greatly improved. In the image classification task, we introduced how to train commonly used models in the classic dataset ImageNet, including AlexNet, VGG, GoogLeNet, ResNet, Inception- V4, MobileNet, DPN (Dual
Path Network), SE-ResNeXt model. We also provide open source \ `trained model <https://github.com/PaddlePaddle/models/blob/develop/PaddleCV/image_classification/README_cn.md#>`__\ to make it convenient for users to download and use. It also provides tools to convert Caffe models into PaddlePaddle Fluid model configurations and parameter files.
Path Network), SE-ResNeXt model. We also provide open source \ `trained model <https://github.com/PaddlePaddle/models/blob/develop/PaddleCV/image_classification/README_cn.md#>`__\ to make it convenient for users to download and use. It also provides tools to convert Caffe models into PaddlePaddle Fluid model configurations and parameter files.
...
@@ -22,9 +22,9 @@ Path Network), SE-ResNeXt model. We also provide open source \ `trained model <h
...
@@ -22,9 +22,9 @@ Path Network), SE-ResNeXt model. We also provide open source \ `trained model <h
Object Detection
Object Detection
-----------------
-----------------
The goal of the object detection task is to give an image or a video frame, let the computer find the locations of all the objects, and give the specific category of each object. For humans, target detection is a very simple task. However, the computer can only "see" the number after the image is encoded. It is difficult to solve the high-level semantic concept such as human or object in the image or video frame, and it is more difficult to locate the area where the target appears in the image. At the same time, because the target will appear anywhere in the image or video frame, the shape of the target is ever-changing, and the background of the image or video frame varies widely. Many factors make the object detection a challenging problem for the computer.
The goal of the object detection task is to give an image or a video frame, let the computer find the locations of all the objects, and give the specific category of each object. For humans, target detection is a very simple task. However, the computer can only "see" the number after the image is encoded. It is difficult to resolve the high-level semantic concept such as human or object in the image or video frame, and it is more difficult to locate the area where the target appears in the image. At the same time, because the target will appear anywhere in the image or video frame, the shape of the target is ever-changing, and the background of the image or video frame varies widely. Many factors make the object detection a challenging problem for the computer.
In the object detection task, we introduced how to train general object detection model based on dataset `PASCAL VOC <http://host.robots.ox.ac.uk/pascal/VOC/>`__\ , \ `MS COCO <http://cocodataset. Org/#home>`__\ . Currently we introduced SSD algorithm, which is the acronym for Single Shot MultiBox Detector. As one of the newer and better detection algorithms in the object detection field, it features fast detection speed and detection High precision.
In the object detection task, we introduced how to train general object detection model based on dataset `PASCAL VOC <http://host.robots.ox.ac.uk/pascal/VOC/>`__\ , \ `MS COCO <http://cocodataset. Org/#home>`__\ . Currently we introduced SSD algorithm, which is the acronym for Single Shot MultiBox Detector. As one of the newer and better detection algorithms in the object detection field, it features fast detection speed and high precision detection.
Detecting human faces in an open environment, especially small, obscured and partially occluded faces is also a challenging task. We also introduced how to train Baidu's self-developed face detection PyramidBox model based on `WIDER FACE <http://mmlab.ie.cuhk.edu.hk/projects/WIDERFace/>`_ data. The algorithm won the `first place <http://mmlab.ie.cuhk.edu.hk/projects/WIDERFace/WiderFace_Results.html>`_ in multiple evaluations of WIDER FACE in March 2018 .
Detecting human faces in an open environment, especially small, obscured and partially occluded faces is also a challenging task. We also introduced how to train Baidu's self-developed face detection PyramidBox model based on `WIDER FACE <http://mmlab.ie.cuhk.edu.hk/projects/WIDERFace/>`_ data. The algorithm won the `first place <http://mmlab.ie.cuhk.edu.hk/projects/WIDERFace/WiderFace_Results.html>`_ in multiple evaluations of WIDER FACE in March 2018 .
...
@@ -45,7 +45,7 @@ Image Synthesis
...
@@ -45,7 +45,7 @@ Image Synthesis
Image Synthesis refers to generating a target image based on an input vector. The input vector here can be random noise or a user-specified condition vector. Specific application scenarios include: handwriting generation, face synthesis, style migration, image restoration, and the like. Current image generation tasks are primarily achieved by Generative Adversarial Networks (GAN). The GAN consists of two subnetworks: a generator and a discriminator. The input to the generator is a random noise or condition vector and the output is the target image. The discriminator is a classifier, the input is an image, and the output is whether the image is a real image. During the training process, the generator and the discriminator enhance their abilities through constant mutual adversarial process.
Image Synthesis refers to generating a target image based on an input vector. The input vector here can be random noise or a user-specified condition vector. Specific application scenarios include: handwriting generation, face synthesis, style migration, image restoration, and the like. Current image generation tasks are primarily achieved by Generative Adversarial Networks (GAN). The GAN consists of two subnetworks: a generator and a discriminator. The input to the generator is a random noise or condition vector and the output is the target image. The discriminator is a classifier, the input is an image, and the output is whether the image is a real image. During the training process, the generator and the discriminator enhance their abilities through constant mutual adversarial process.
In the image synthesis task, we introduced how to use DCGAN and ConditioanlGAN to generate handwritten numbers, and also introduced CycleGAN for style migration.
In the image synthesis task, we introduced how to use DCGAN and ConditionalGAN to generate handwritten numbers, and also introduced CycleGAN for style migration.
Metric learning is also called distance metric learning or similarity learning. Through the distance between learning objects, metric learning can be used to analyze the association and comparison of objects. It can be applied to practical problems like auxiliary classification, aggregation and also widely used in areas such as image retrieval and face recognition. In the past, for different tasks, it was necessary to select appropriate features and manually construct a distance function, but the metric learning can initially learn the metric distance function for a specific task from the main task according to different tasks. The combination of metric learning and deep learning has achieved good performance in the fields of face recognition/verification, human re-ID, image retrieval, etc. In this task, we mainly introduce the depth-based metric learning based on Fluid. The model contains loss functions such as triples and quaternions.
Metric learning is also called distance metric learning or similarity learning. Through the distance between learning objects, metric learning can be used to analyze the association and comparison of objects. It can be applied to practical problems like auxiliary classification, aggregation and also widely used in areas such as image retrieval and face recognition. In the past, for different tasks, it was necessary to select appropriate features and manually construct a distance function, but metric learning can initially learn the metric distance function for a specific task from the main task according to different tasks. The combination of metric learning and deep learning has achieved good performance in the fields of face recognition/verification, human re-ID, image retrieval, etc. In this task, we mainly introduce the depth-based metric learning based on Fluid. The model contains loss functions such as triples and quaternions.
@@ -92,7 +92,7 @@ Different from the end-to-end direct prediction for word distribution of the dee
...
@@ -92,7 +92,7 @@ Different from the end-to-end direct prediction for word distribution of the dee
Machine Translation
Machine Translation
---------------------
---------------------
Machine Translation transforms a natural language (source language) into another natural language (target language), which is a very basic and important research direction in natural language processing. In the wave of globalization, the important role played by machine translation in promoting cross-language civilization communication is self-evident. Its development has gone through stages such as statistical machine translation and neural-network-based Neuro Machine Translation (NMT). After NMT matured, machine translation was really applied on a large scale. The early stage of NMT is mainly based on the recurrent neural network RNN. The current time step in the training process depends on the calculation of the previous time step, so it is difficult to parallelize the time steps to improve the training speed. Therefore, NMTs of non-RNN structures have emerged, such as structures based on convolutional neural networks CNN and structures based on Self-Attention.
Machine Translation transforms a natural language (source language) into another natural language (target language), which is a very basic and important function in natural language processing. With the wave of globalization, the important role played by machine translation in promoting cross-language civilization communication is self-evident. Its development has gone through stages such as statistical machine translation (SMT) and neural-network-based Neuro Machine Translation (NMT). After NMT matured, machine translation was really applied on a large scale. The early stage of NMT is mainly based on the recurrent neural network RNN. The current time step in the training process depends on the calculation of the previous time step, so it is difficult to parallelize the time steps to improve the training speed. Therefore, NMTs of non-RNN structures have emerged, such as structures based on convolutional neural networks CNN and structures based on Self-Attention.
The Transformer implemented in this example is a machine translation model based on the self-attention mechanism, in which there is no more RNN or CNN structure, but fully utilizes Attention to learn the context dependency. Compared with RNN/CNN, in a single layer, this structure has lower computational complexity, easier parallelization, and easier modeling for long-range dependencies, and finally achieves the best translation effect among multiple languages.
The Transformer implemented in this example is a machine translation model based on the self-attention mechanism, in which there is no more RNN or CNN structure, but fully utilizes Attention to learn the context dependency. Compared with RNN/CNN, in a single layer, this structure has lower computational complexity, easier parallelization, and easier modeling for long-range dependencies, and finally achieves the best translation effect among multiple languages.
...
@@ -157,9 +157,9 @@ Baidu reading comprehension dataset is an open-source real-world dataset publici
...
@@ -157,9 +157,9 @@ Baidu reading comprehension dataset is an open-source real-world dataset publici
Personalized recommendation
Personalized recommendation
---------------------------------
---------------------------------
The recommendation system is playing an increasingly important role in the current Internet service. At present, most e-commerce systems, social networks, advertisement recommendation, and search engines all use various forms of personalized recommendation technology to help users quickly find the information they want.
Recommendation systems are playing an increasingly important role in internet services and applications. At present, most e-commerce systems, social networks, advertisement recommendation, and search engines all use various forms of personalized recommendation technology to help users quickly find the information, products or services they are looking for, or are closely related.
In an industrially adoptable recommendation system, the recommendation strategy is generally divided into multiple modules in series. Take the news recommendation system as an example. There are multiple procedures that can use deep learning techniques, such as automated annotation of news, personalized news recall, personalized matching and sorting. PaddlePaddle provides complete support for the training of recommendation algorithms and provides a variety of model configurations for users to choose from.
In an industrial-scale recommendation system, the recommendation strategy is generally divided into multiple modules in series. Take the news recommendation system as an example. There are multiple procedures that can use deep learning techniques, such as automated annotation of news, personalized news recall, personalized matching and sorting. PaddlePaddle provides complete support for the training of recommendation algorithms and provides a variety of model configurations for users to choose from.