# ONNX Live Tutorial
This tutorial will show you how to convert a neural style transfer model that has been exported from PyTorch into the Apple CoreML format using ONNX. This will allow you to easily run deep learning models on Apple devices and, in this case, live stream from the camera.
## What is ONNX?
ONNX (Open Neural Network Exchange) is an open format to represent deep learning models. With ONNX, AI developers can more easily move models between state-of-the-art tools and choose the combination that is best for them. ONNX is developed and supported by a community of partners. You can learn more about ONNX and what tools are supported by going to [onnx.ai](https://onnx.ai/).
## Tutorial Overview
This tutorial will walk you through 4 main steps:
1. [Download (or train) PyTorch style transfer models](#download-or-train-pytorch-style-transfer-models)
2. [Convert the PyTorch models to ONNX models](#convert-the-pytorch-models-to-onnx-models)
3. [Convert the ONNX models to CoreML models](#convert-the-onnx-models-to-coreml-models)
4. [Run the CoreML models in a style transfer iOS App](#run-the-coreml-models-in-a-style-transfer-ios-app)
## Preparing the Environment
We will be working in a virtualenv in order to avoid conflicts with your local packages. We are also using Python 3.6 for this tutorial, but other versions should work as well.
```sh
python3.6 -m venv venv
source ./venv/bin/activate
```
You need to install PyTorch and the ONNX-to-CoreML converter:
```sh
pip install torchvision onnx-coreml
```
You will also need to install Xcode if you want to run the iOS style transfer app on your iPhone. You can also convert models on Linux; however, to run the iOS app itself, you will need a Mac.
## Download (or train) PyTorch style transfer models
For this tutorial, we will use the style transfer models that are published with PyTorch in [https://github.com/pytorch/examples/tree/master/fast_neural_style](https://github.com/pytorch/examples/tree/master/fast_neural_style). If you would like to use a different PyTorch or ONNX model, feel free to skip this step.
These models are meant for applying style transfer to still images and are not really optimized to be fast enough for video. However, if we reduce the resolution enough, they can also work well on videos.
Let’s download the models:
```sh
git clone https://github.com/pytorch/examples
cd examples/fast_neural_style
```
If you would like to train the models yourself, the pytorch/examples repository you just cloned has more information on how to do this. For now, we’ll just download pre-trained models with the script provided by the repository:
```sh
./download_saved_models.sh
```
This script downloads the pre-trained PyTorch models and puts them into the `saved_models` folder. There should now be 4 files, `candy.pth`, `mosaic.pth`, `rain_princess.pth` and `udnie.pth` in your directory.
## Convert the PyTorch models to ONNX models
Now that we have the pre-trained PyTorch models as `.pth` files in the `saved_models` folder, we will need to convert them to the ONNX format. The model definition is in the pytorch/examples repository we cloned previously, and with a few lines of Python we can export it to ONNX. In this case, instead of actually running the neural net, we will call `torch.onnx._export`, which is provided with PyTorch as an API to directly export ONNX-formatted models from PyTorch. However, we don’t even need to do that ourselves, because a script, `neural_style/neural_style.py`, already exists that will do this for us. You can also take a look at that script if you would like to apply it to other models.
Exporting the ONNX format from PyTorch is essentially tracing your neural network, so this API call will internally run the network on ‘dummy data’ in order to generate the graph. For this, it needs an input image to apply the style transfer to, which can simply be a blank image. However, the pixel size of this image is important, as this will be the input size of the exported style transfer model. To get good performance, we’ll use a resolution of 250x540. Feel free to take a larger resolution if you care less about FPS and more about style transfer quality.
Let’s use [ImageMagick](https://www.imagemagick.org/) to create a blank image of the resolution we want:
```sh
convert -size 250x540 xc:white png24:dummy.jpg
```
and use that to export the PyTorch models:
```sh
python ./neural_style/neural_style.py eval --content-image dummy.jpg --output-image dummy-out.jpg --model ./saved_models/candy.pth --cuda 0 --export_onnx ./saved_models/candy.onnx
python ./neural_style/neural_style.py eval --content-image dummy.jpg --output-image dummy-out.jpg --model ./saved_models/udnie.pth --cuda 0 --export_onnx ./saved_models/udnie.onnx
python ./neural_style/neural_style.py eval --content-image dummy.jpg --output-image dummy-out.jpg --model ./saved_models/rain_princess.pth --cuda 0 --export_onnx ./saved_models/rain_princess.onnx
python ./neural_style/neural_style.py eval --content-image dummy.jpg --output-image dummy-out.jpg --model ./saved_models/mosaic.pth --cuda 0 --export_onnx ./saved_models/mosaic.onnx
```
You should end up with 4 files, `candy.onnx`, `mosaic.onnx`, `rain_princess.onnx` and `udnie.onnx`, created from the corresponding `.pth` files.
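Under the hood, the export in `neural_style.py` boils down to a few lines like the following. This is only a sketch, assuming the `TransformerNet` definition from the repository we cloned; the actual script additionally strips deprecated InstanceNorm running-stats keys from the checkpoint before loading it:
```py
import torch
from neural_style.transformer_net import TransformerNet

# Load a pre-trained style transfer checkpoint into the model definition.
model = TransformerNet()
model.load_state_dict(torch.load('./saved_models/candy.pth'))

# Exporting traces the model once on dummy data of the desired resolution
# (batch x channels x height x width) to record the ONNX graph.
dummy_input = torch.rand(1, 3, 540, 250)
torch.onnx._export(model, dummy_input, './saved_models/candy.onnx')
```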
## Convert the ONNX models to CoreML models
Now that we have ONNX models, we can convert them to CoreML models in order to run them on Apple devices. For this, we use the onnx-coreml converter we installed previously. The converter comes with a `convert-onnx-to-coreml` script, which the installation steps above added to our path. Unfortunately, that won’t work for us, as we need to mark the input and output of the network as an image and, while this is supported by the converter, it is only supported when calling the converter from Python.
Looking at the style transfer model (for example, by opening the .onnx file in an application like [Netron](https://github.com/lutzroeder/Netron)), we see that the input is named ‘0’ and the output is named ‘186’. These are just numeric IDs assigned by PyTorch. We will need to mark these as images.
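If you prefer not to open a GUI tool, the input and output names can also be read programmatically. Here is a small sketch using the `onnx` package:
```py
import onnx

# Inspect the graph's input and output names of an exported model.
model_proto = onnx.load('./saved_models/candy.onnx')
print([inp.name for inp in model_proto.graph.input])   # e.g. ['0']
print([out.name for out in model_proto.graph.output])  # e.g. ['186']
```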
So let’s create a small Python file and call it `onnx_to_coreml.py`. Create it with the `touch` command and edit it with your favorite editor to add the following lines of code.
```py
import sys
from onnx import onnx_pb
from onnx_coreml import convert

model_in = sys.argv[1]
model_out = sys.argv[2]

# Load the serialized ONNX model into a protobuf object
model_file = open(model_in, 'rb')
model_proto = onnx_pb.ModelProto()
model_proto.ParseFromString(model_file.read())

# Convert to CoreML, marking the network input ('0') and output ('186') as images
coreml_model = convert(model_proto, image_input_names=['0'], image_output_names=['186'])
coreml_model.save(model_out)
```
We now run it:
```sh
python onnx_to_coreml.py ./saved_models/candy.onnx ./saved_models/candy.mlmodel
python onnx_to_coreml.py ./saved_models/udnie.onnx ./saved_models/udnie.mlmodel
python onnx_to_coreml.py ./saved_models/rain_princess.onnx ./saved_models/rain_princess.mlmodel
python onnx_to_coreml.py ./saved_models/mosaic.onnx ./saved_models/mosaic.mlmodel
```
Now, there should be 4 CoreML models in your `saved_models` directory: `candy.mlmodel`, `mosaic.mlmodel`, `rain_princess.mlmodel` and `udnie.mlmodel`.
## Run the CoreML models in a style transfer iOS App
The [onnx/tutorials](https://github.com/onnx/tutorials) repository contains an iOS app able to run CoreML style transfer models on a live camera stream from your phone camera. Let’s clone the repository:
```sh
git clone https://github.com/onnx/tutorials
```
and open the `tutorials/examples/CoreML/ONNXLive/ONNXLive.xcodeproj` project in Xcode. We recommend using Xcode 9.3 and an iPhone X; there might be issues running on older devices or Xcode versions.
In the `Models/` folder, the project contains some .mlmodel files. We’re going to replace them with the models we just created.
Then run the app on your iPhone and you are all set. Tapping on the screen switches between the models.
## Conclusion
We hope this tutorial gave you an overview of what ONNX is about and how you can use it to convert neural networks between frameworks, in this case neural style transfer models moving from PyTorch to CoreML.
Feel free to experiment with these steps and test them on your own models. Please let us know if you hit any issues or want to give feedback. We’d like to hear what you think.
+ Chinese Tutorials
  + [Getting Started](tut_getting_started.md)
    + [Deep Learning with PyTorch: A 60 Minute Blitz](deep_learning_60min_blitz.md)
    + [Data Loading and Processing Tutorial](data_loading_tutorial.md)
    + [Learning PyTorch with Examples](pytorch_with_examples.md)
    + [Transfer Learning Tutorial](transfer_learning_tutorial.md)
    + [Deploying a Seq2Seq Model with the Hybrid Frontend](deploy_seq2seq_hybrid_frontend_tutorial.md)
    + [Saving and Loading Models](saving_loading_models.md)
    + [What is `torch.nn` _really_?](nn_tutorial.md)
  + [Image](tut_image.md)
    + [Finetuning Torchvision Models](finetuning_torchvision_models_tutorial.md)
    + [Spatial Transformer Networks Tutorial](spatial_transformer_tutorial.md)
    + [Neural Transfer Using PyTorch](neural_style_tutorial.md)
    + [Adversarial Example Generation](fgsm_tutorial.md)
    + [Transferring a Model from PyTorch to Caffe2 and Mobile using ONNX](super_resolution_with_caffe2.md)
  + [Text](tut_text.md)
    + [Chatbot Tutorial](chatbot_tutorial.md)
    + [Generating Names with a Character-Level RNN](char_rnn_generation_tutorial.md)
    + [Classifying Names with a Character-Level RNN](char_rnn_classification_tutorial.md)
    + [Deep Learning for NLP with Pytorch](deep_learning_nlp_tutorial.md)
    + [Translation with a Sequence to Sequence Network and Attention](seq2seq_translation_tutorial.md)
  + [Generative](tut_generative.md)
    + [DCGAN Tutorial](dcgan_faces_tutorial.md)
  + [Reinforcement Learning](tut_reinforcement_learning.md)
    + [Reinforcement Learning (DQN) Tutorial](reinforcement_q_learning.md)
  + [Extending PyTorch](tut_extending_pytorch.md)
    + [Creating Extensions Using numpy and scipy](numpy_extensions_tutorial.md)
    + [Custom C++ and CUDA Extensions](cpp_extension.md)
    + [Extending TorchScript with Custom C++ Operators](torch_script_custom_ops.md)
  + [Production Usage](tut_production_usage.md)
    + [Writing Distributed Applications with PyTorch](dist_tuto.md)
    + [PyTorch 1.0 Distributed Trainer with Amazon AWS](aws_distributed_training_tutorial.md)
    + [ONNX Live Tutorial](ONNXLive.md)
    + [Loading a PyTorch Model in C++](cpp_export.md)
  + [PyTorch in Other Languages](tut_other_language.md)
    + [Using the PyTorch C++ Frontend](cpp_frontend.md)
# Classifying Names with a Character-Level RNN
**Author**: [Sean Robertson](https://github.com/spro/practical-pytorch)
We will be building and training a basic character-level RNN to classify words. A character-level RNN reads words as a series of characters - outputting a prediction and “hidden state” at each step, feeding its previous hidden state into each next step. We take the final prediction to be the output, i.e. which class the word belongs to.
Specifically, we’ll train on a few thousand surnames from 18 languages of origin, and predict which language a name is from based on the spelling:
```sh
$ python predict.py Hinton
(-0.47) Scottish
(-1.52) English
(-3.57) Irish
$ python predict.py Schmidhuber
(-0.19) German
(-2.48) Czech
(-2.68) Dutch
```
**Recommended Reading:**
I assume you have at least installed PyTorch, know Python, and understand Tensors:
* [https://pytorch.org/](https://pytorch.org/) For installation instructions
* [Deep Learning with PyTorch: A 60 Minute Blitz](../beginner/deep_learning_60min_blitz.html) to get started with PyTorch in general
* [Learning PyTorch with Examples](../beginner/pytorch_with_examples.html) for a wide and deep overview
* [PyTorch for Former Torch Users](../beginner/former_torchies_tutorial.html) if you are a former Lua Torch user
It would also be useful to know about RNNs and how they work:
* [The Unreasonable Effectiveness of Recurrent Neural Networks](https://karpathy.github.io/2015/05/21/rnn-effectiveness/) shows a bunch of real life examples
* [Understanding LSTM Networks](https://colah.github.io/posts/2015-08-Understanding-LSTMs/) is about LSTMs specifically but also informative about RNNs in general
## Preparing the Data
Note
Download the data from [here](https://download.pytorch.org/tutorial/data.zip) and extract it to the current directory.
Included in the `data/names` directory are 18 text files named as “[Language].txt”. Each file contains a bunch of names, one name per line, mostly romanized (but we still need to convert from Unicode to ASCII).
We’ll end up with a dictionary of lists of names per language, `{language: [names ...]}`. The generic variables “category” and “line” (for language and name in our case) are used for later extensibility.
```py
from __future__ import unicode_literals, print_function, division
from io import open
import glob
import os
def findFiles(path): return glob.glob(path)
print(findFiles('data/names/*.txt'))
import unicodedata
import string
all_letters = string.ascii_letters + " .,;'"
n_letters = len(all_letters)
# Turn a Unicode string to plain ASCII, thanks to https://stackoverflow.com/a/518232/2809427
def unicodeToAscii(s):
return ''.join(
c for c in unicodedata.normalize('NFD', s)
if unicodedata.category(c) != 'Mn'
and c in all_letters
)
print(unicodeToAscii('Ślusàrski'))
# Build the category_lines dictionary, a list of names per language
category_lines = {}
all_categories = []
# Read a file and split into lines
def readLines(filename):
lines = open(filename, encoding='utf-8').read().strip().split('\n')
return [unicodeToAscii(line) for line in lines]
for filename in findFiles('data/names/*.txt'):
category = os.path.splitext(os.path.basename(filename))[0]
all_categories.append(category)
lines = readLines(filename)
category_lines[category] = lines
n_categories = len(all_categories)
```
Out:
```py
['data/names/Italian.txt', 'data/names/German.txt', 'data/names/Portuguese.txt', 'data/names/Chinese.txt', 'data/names/Greek.txt', 'data/names/Polish.txt', 'data/names/French.txt', 'data/names/English.txt', 'data/names/Spanish.txt', 'data/names/Arabic.txt', 'data/names/Czech.txt', 'data/names/Russian.txt', 'data/names/Irish.txt', 'data/names/Dutch.txt', 'data/names/Scottish.txt', 'data/names/Vietnamese.txt', 'data/names/Korean.txt', 'data/names/Japanese.txt']
Slusarski
```
Now we have `category_lines`, a dictionary mapping each category (language) to a list of lines (names). We also kept track of `all_categories` (just a list of languages) and `n_categories` for later reference.
```py
print(category_lines['Italian'][:5])
```
Out:
```py
['Abandonato', 'Abatangelo', 'Abatantuono', 'Abate', 'Abategiovanni']
```
### Turning Names into Tensors
Now that we have all the names organized, we need to turn them into Tensors to make any use of them.
To represent a single letter, we use a “one-hot vector” of size `<1 x n_letters>`. A one-hot vector is filled with 0s except for a 1 at the index of the current letter, e.g. `"b" = <0 1 0 0 0 ...>`.
To make a word we join a bunch of those into a 2D matrix `<line_length x 1 x n_letters>`.
That extra 1 dimension is because PyTorch assumes everything is in batches - we’re just using a batch size of 1 here.
```py
import torch
# Find letter index from all_letters, e.g. "a" = 0
def letterToIndex(letter):
return all_letters.find(letter)
# Just for demonstration, turn a letter into a <1 x n_letters> Tensor
def letterToTensor(letter):
tensor = torch.zeros(1, n_letters)
tensor[0][letterToIndex(letter)] = 1
return tensor
# Turn a line into a <line_length x 1 x n_letters>,
# or an array of one-hot letter vectors
def lineToTensor(line):
tensor = torch.zeros(len(line), 1, n_letters)
for li, letter in enumerate(line):
tensor[li][0][letterToIndex(letter)] = 1
return tensor
print(letterToTensor('J'))
print(lineToTensor('Jones').size())
```
Out:
```py
tensor([[0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 1.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0.]])
torch.Size([5, 1, 57])
```
## Creating the Network
Before autograd, creating a recurrent neural network in Torch involved cloning the parameters of a layer over several timesteps. The layers held hidden state and gradients, which are now entirely handled by the graph itself. This means you can implement an RNN in a very “pure” way, as regular feed-forward layers.
This RNN module (mostly copied from [the PyTorch for Torch users tutorial](https://pytorch.org/tutorials/beginner/former_torchies/nn_tutorial.html#example-2-recurrent-net)) is just 2 linear layers which operate on an input and hidden state, with a LogSoftmax layer after the output.
![](img/592fae78143370fffc1d0c7957706384.jpg)
```py
import torch.nn as nn
class RNN(nn.Module):
def __init__(self, input_size, hidden_size, output_size):
super(RNN, self).__init__()
self.hidden_size = hidden_size
self.i2h = nn.Linear(input_size + hidden_size, hidden_size)
self.i2o = nn.Linear(input_size + hidden_size, output_size)
self.softmax = nn.LogSoftmax(dim=1)
def forward(self, input, hidden):
combined = torch.cat((input, hidden), 1)
hidden = self.i2h(combined)
output = self.i2o(combined)
output = self.softmax(output)
return output, hidden
def initHidden(self):
return torch.zeros(1, self.hidden_size)
n_hidden = 128
rnn = RNN(n_letters, n_hidden, n_categories)
```
To run a step of this network we need to pass an input (in our case, the Tensor for the current letter) and a previous hidden state (which we initialize as zeros at first). We’ll get back the output (probability of each language) and a next hidden state (which we keep for the next step).
```py
input = letterToTensor('A')
hidden = torch.zeros(1, n_hidden)
output, next_hidden = rnn(input, hidden)
```
For the sake of efficiency we don’t want to be creating a new Tensor for every step, so we will use `lineToTensor` instead of `letterToTensor` and use slices. This could be further optimized by pre-computing batches of Tensors (see the batching sketch at the end of this section).
```py
input = lineToTensor('Albert')
hidden = torch.zeros(1, n_hidden)
output, next_hidden = rnn(input[0], hidden)
print(output)
```
Out:
```py
tensor([[-2.8857, -2.9005, -2.8386, -2.9397, -2.8594, -2.8785, -2.9361, -2.8270,
-2.9602, -2.8583, -2.9244, -2.9112, -2.8545, -2.8715, -2.8328, -2.8233,
-2.9685, -2.9780]], grad_fn=<LogSoftmaxBackward>)
```
As you can see, the output is a `<1 x n_categories>` Tensor, where every item is the likelihood of that category (higher is more likely).
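As an aside, here is what the pre-computed batching mentioned earlier could look like. This sketch is not part of the original tutorial, which sticks to a batch size of 1:
```py
# Pad several lines into one <max_length x batch_size x n_letters> tensor,
# leaving zero vectors where a shorter name has no letter.
def linesToTensor(lines):
    max_len = max(len(line) for line in lines)
    tensor = torch.zeros(max_len, len(lines), n_letters)
    for bi, line in enumerate(lines):
        for li, letter in enumerate(line):
            tensor[li][bi][letterToIndex(letter)] = 1
    return tensor

print(linesToTensor(['Jones', 'Albert']).size())  # torch.Size([6, 2, 57])
```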
## Training
### Preparing for Training
Before going into training we should make a few helper functions. The first is to interpret the output of the network, which we know to be a likelihood of each category. We can use `Tensor.topk` to get the index of the greatest value:
```py
def categoryFromOutput(output):
top_n, top_i = output.topk(1)
category_i = top_i[0].item()
return all_categories[category_i], category_i
print(categoryFromOutput(output))
```
Out:
```py
('Vietnamese', 15)
```
We will also want a quick way to get a training example (a name and its language):
```py
import random
def randomChoice(l):
return l[random.randint(0, len(l) - 1)]
def randomTrainingExample():
category = randomChoice(all_categories)
line = randomChoice(category_lines[category])
category_tensor = torch.tensor([all_categories.index(category)], dtype=torch.long)
line_tensor = lineToTensor(line)
return category, line, category_tensor, line_tensor
for i in range(10):
category, line, category_tensor, line_tensor = randomTrainingExample()
print('category =', category, '/ line =', line)
```
Out:
```py
category = Russian / line = Minkin
category = French / line = Masson
category = German / line = Hasek
category = Dutch / line = Kloeten
category = Scottish / line = Allan
category = Italian / line = Agostini
category = Japanese / line = Fumihiko
category = Polish / line = Gajos
category = Scottish / line = Duncan
category = Arabic / line = Gerges
```
### Training the Network
Now all it takes to train this network is show it a bunch of examples, have it make guesses, and tell it if it’s wrong.
For the loss function `nn.NLLLoss` is appropriate, since the last layer of the RNN is `nn.LogSoftmax`.
```py
criterion = nn.NLLLoss()
```
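As a quick added illustration, `nn.NLLLoss` consumes the log-probabilities that `nn.LogSoftmax` produces, together with a target class index:
```py
# Loss for a single step, using the `output` computed above.
example_target = torch.tensor([all_categories.index('Scottish')], dtype=torch.long)
print(criterion(output, example_target))
```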
Each loop of training will:
* Create input and target tensors
* Create a zeroed initial hidden state
* Read each letter in and
* Keep hidden state for next letter
* Compare final output to target
* Back-propagate
* Return the output and loss
```py
learning_rate = 0.005 # If you set this too high, it might explode. If too low, it might not learn
def train(category_tensor, line_tensor):
hidden = rnn.initHidden()
rnn.zero_grad()
for i in range(line_tensor.size()[0]):
output, hidden = rnn(line_tensor[i], hidden)
loss = criterion(output, category_tensor)
loss.backward()
# Add parameters' gradients to their values, multiplied by learning rate
for p in rnn.parameters():
p.data.add_(-learning_rate, p.grad.data)
return output, loss.item()
```
Now we just have to run that with a bunch of examples. Since the `train` function returns both the output and loss, we can print its guesses and also keep track of the loss for plotting. Since there are thousands of examples, we print only every `print_every` examples, and take an average of the loss.
```py
import time
import math
n_iters = 100000
print_every = 5000
plot_every = 1000
# Keep track of losses for plotting
current_loss = 0
all_losses = []
def timeSince(since):
now = time.time()
s = now - since
m = math.floor(s / 60)
s -= m * 60
return '%dm %ds' % (m, s)
start = time.time()
for iter in range(1, n_iters + 1):
category, line, category_tensor, line_tensor = randomTrainingExample()
output, loss = train(category_tensor, line_tensor)
current_loss += loss
# Print iter number, loss, name and guess
if iter % print_every == 0:
guess, guess_i = categoryFromOutput(output)
correct = '✓' if guess == category else '✗ (%s)' % category
print('%d %d%% (%s) %.4f %s / %s %s' % (iter, iter / n_iters * 100, timeSince(start), loss, line, guess, correct))
# Add current loss avg to list of losses
if iter % plot_every == 0:
all_losses.append(current_loss / plot_every)
current_loss = 0
```
Out:
```py
5000 5% (0m 11s) 2.0318 Jaeger / German
10000 10% (0m 18s) 2.1296 Sokolofsky / Russian (Polish)
15000 15% (0m 26s) 1.2620 Jo / Korean
20000 20% (0m 34s) 1.9295 Livson / Scottish (Russian)
25000 25% (0m 41s) 1.2325 Fortier / French
30000 30% (0m 49s) 2.5714 Purdes / Dutch (Czech)
35000 35% (0m 56s) 2.3312 Bayer / Arabic (German)
40000 40% (1m 4s) 2.3792 Mitchell / Dutch (Scottish)
45000 45% (1m 12s) 1.3536 Maes / Dutch
50000 50% (1m 20s) 2.6095 Sai / Chinese (Vietnamese)
55000 55% (1m 28s) 0.5883 Cheung / Chinese
60000 60% (1m 35s) 1.5788 William / Irish
65000 65% (1m 43s) 2.5809 Mulder / Scottish (Dutch)
70000 70% (1m 51s) 1.3440 Bruce / German (Scottish)
75000 75% (1m 58s) 1.1839 Romero / Italian (Spanish)
80000 80% (2m 6s) 2.6453 Reyes / Portuguese (Spanish)
85000 85% (2m 14s) 0.0290 Mcmillan / Scottish
90000 90% (2m 22s) 0.7337 Riagan / Irish
95000 95% (2m 30s) 2.6208 Maneates / Dutch (Greek)
100000 100% (2m 37s) 0.5170 Szwarc / Polish
```
### Plotting the Results
Plotting the historical loss from `all_losses` shows the network learning:
```py
import matplotlib.pyplot as plt
import matplotlib.ticker as ticker
plt.figure()
plt.plot(all_losses)
```
![https://pytorch.org/tutorials/_images/sphx_glr_char_rnn_classification_tutorial_001.png](img/cc57a36a43d450df4bfc1d1d1b1ce274.jpg)
## Evaluating the Results
To see how well the network performs on different categories, we will create a confusion matrix, indicating for every actual language (rows) which language the network guesses (columns). To calculate the confusion matrix a bunch of samples are run through the network with `evaluate()`, which is the same as `train()` minus the backprop.
```py
# Keep track of correct guesses in a confusion matrix
confusion = torch.zeros(n_categories, n_categories)
n_confusion = 10000
# Just return an output given a line
def evaluate(line_tensor):
hidden = rnn.initHidden()
for i in range(line_tensor.size()[0]):
output, hidden = rnn(line_tensor[i], hidden)
return output
# Go through a bunch of examples and record which are correctly guessed
for i in range(n_confusion):
category, line, category_tensor, line_tensor = randomTrainingExample()
output = evaluate(line_tensor)
guess, guess_i = categoryFromOutput(output)
category_i = all_categories.index(category)
confusion[category_i][guess_i] += 1
# Normalize by dividing every row by its sum
for i in range(n_categories):
confusion[i] = confusion[i] / confusion[i].sum()
# Set up plot
fig = plt.figure()
ax = fig.add_subplot(111)
cax = ax.matshow(confusion.numpy())
fig.colorbar(cax)
# Set up axes
ax.set_xticklabels([''] + all_categories, rotation=90)
ax.set_yticklabels([''] + all_categories)
# Force label at every tick
ax.xaxis.set_major_locator(ticker.MultipleLocator(1))
ax.yaxis.set_major_locator(ticker.MultipleLocator(1))
# sphinx_gallery_thumbnail_number = 2
plt.show()
```
![https://pytorch.org/tutorials/_images/sphx_glr_char_rnn_classification_tutorial_002.png](img/029a9d26725997aae97e9e3f6f10067f.jpg)
You can pick out bright spots off the main axis that show which languages it guesses incorrectly, e.g. Chinese for Korean, and Spanish for Italian. It seems to do very well with Greek, and very poorly with English (perhaps because of overlap with other languages).
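To put numbers on that impression, we can read per-language accuracy off the diagonal of the normalized confusion matrix (a small addition, not part of the original tutorial):
```py
# The diagonal holds the fraction of correctly guessed names per language.
for i, name in enumerate(all_categories):
    print('%s: %.0f%%' % (name, (100 * confusion[i][i]).item()))
```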
### Running on User Input
```py
def predict(input_line, n_predictions=3):
print('\n> %s' % input_line)
with torch.no_grad():
output = evaluate(lineToTensor(input_line))
# Get top N categories
topv, topi = output.topk(n_predictions, 1, True)
predictions = []
for i in range(n_predictions):
value = topv[0][i].item()
category_index = topi[0][i].item()
print('(%.2f) %s' % (value, all_categories[category_index]))
predictions.append([value, all_categories[category_index]])
predict('Dovesky')
predict('Jackson')
predict('Satoshi')
```
Out:
```py
> Dovesky
(-0.74) Russian
(-0.77) Czech
(-3.31) English
> Jackson
(-0.80) Scottish
(-1.69) English
(-1.84) Russian
> Satoshi
(-1.16) Japanese
(-1.89) Arabic
(-1.90) Polish
```
The final versions of the scripts [in the Practical PyTorch repo](https://github.com/spro/practical-pytorch/tree/master/char-rnn-classification) split the above code into a few files:
* `data.py` (loads files)
* `model.py` (defines the RNN)
* `train.py` (runs training)
* `predict.py` (runs `predict()` with command line arguments)
* `server.py` (serves predictions as a JSON API with bottle.py)
Run `train.py` to train and save the network.
Run `predict.py` with a name to view predictions:
```sh
$ python predict.py Hazaki
(-0.42) Japanese
(-1.39) Polish
(-3.51) Czech
```
Run `server.py` and visit [http://localhost:5533/Yourname](http://localhost:5533/Yourname) to get JSON output of predictions.
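For example, you could query the server from Python; this sketch assumes `server.py` is running locally on port 5533 as described above:
```py
import json
from urllib.request import urlopen

# Fetch the JSON predictions for a name from the running bottle.py server.
print(json.load(urlopen('http://localhost:5533/Yourname')))
```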
## Exercises
* Try with a different dataset of line -> category, for example:
  * Any word -> language
  * First name -> gender
  * Character name -> writer
  * Page title -> blog or subreddit
* Get better results with a bigger and/or better shaped network
  * Add more linear layers
  * Try the `nn.LSTM` and `nn.GRU` layers
  * Combine multiple of these RNNs as a higher level network
**Total running time of the script:** (2 minutes 45.719 seconds)
[`Download Python source code: char_rnn_classification_tutorial.py`](../_downloads/ccb15f8365bdae22a0a019e57216d7c6/char_rnn_classification_tutorial.py)
[`Download Jupyter notebook: char_rnn_classification_tutorial.ipynb`](../_downloads/977c14818c75427641ccb85ad21ed6dc/char_rnn_classification_tutorial.ipynb)
[Gallery generated by Sphinx-Gallery](https://sphinx-gallery.readthedocs.io)
# Generating Names with a Character-Level RNN
**Author**: [Sean Robertson](https://github.com/spro/practical-pytorch)
In the [last tutorial](char_rnn_classification_tutorial.html) we used an RNN to classify names into their language of origin. This time we’ll turn around and generate names from languages.
```sh
> python sample.py Russian RUS
Rovakov
Uantov
Shavakov
> python sample.py German GER
Gerren
Ereng
Rosher
> python sample.py Spanish SPA
Salla
Parer
Allan
> python sample.py Chinese CHI
Chan
Hang
Iun
```
We are still hand-crafting a small RNN with a few linear layers. The big difference is instead of predicting a category after reading in all the letters of a name, we input a category and output one letter at a time. Recurrently predicting characters to form language (this could also be done with words or other higher order constructs) is often referred to as a “language model”.
**Recommended Reading:**
I assume you have at least installed PyTorch, know Python, and understand Tensors:
* [https://pytorch.org/](https://pytorch.org/) For installation instructions
* [Deep Learning with PyTorch: A 60 Minute Blitz](../beginner/deep_learning_60min_blitz.html) to get started with PyTorch in general
* [Learning PyTorch with Examples](../beginner/pytorch_with_examples.html) for a wide and deep overview
* [PyTorch for Former Torch Users](../beginner/former_torchies_tutorial.html) if you are a former Lua Torch user
It would also be useful to know about RNNs and how they work:
* [The Unreasonable Effectiveness of Recurrent Neural Networks](https://karpathy.github.io/2015/05/21/rnn-effectiveness/) shows a bunch of real life examples
* [Understanding LSTM Networks](https://colah.github.io/posts/2015-08-Understanding-LSTMs/) is about LSTMs specifically but also informative about RNNs in general
I also suggest the previous tutorial, [Classifying Names with a Character-Level RNN](char_rnn_classification_tutorial.html)
## Preparing the Data
Note
Download the data from [here](https://download.pytorch.org/tutorial/data.zip) and extract it to the current directory.
See the last tutorial for more detail of this process. In short, there are a bunch of plain text files `data/names/[Language].txt` with a name per line. We split lines into an array, convert Unicode to ASCII, and end up with a dictionary `{language: [names ...]}`.
```py
from __future__ import unicode_literals, print_function, division
from io import open
import glob
import os
import unicodedata
import string
all_letters = string.ascii_letters + " .,;'-"
n_letters = len(all_letters) + 1 # Plus EOS marker
def findFiles(path): return glob.glob(path)
# Turn a Unicode string to plain ASCII, thanks to https://stackoverflow.com/a/518232/2809427
def unicodeToAscii(s):
return ''.join(
c for c in unicodedata.normalize('NFD', s)
if unicodedata.category(c) != 'Mn'
and c in all_letters
)
# Read a file and split into lines
def readLines(filename):
lines = open(filename, encoding='utf-8').read().strip().split('\n')
return [unicodeToAscii(line) for line in lines]
# Build the category_lines dictionary, a list of lines per category
category_lines = {}
all_categories = []
for filename in findFiles('data/names/*.txt'):
category = os.path.splitext(os.path.basename(filename))[0]
all_categories.append(category)
lines = readLines(filename)
category_lines[category] = lines
n_categories = len(all_categories)
if n_categories == 0:
raise RuntimeError('Data not found. Make sure that you downloaded data '
'from https://download.pytorch.org/tutorial/data.zip and extract it to '
'the current directory.')
print('# categories:', n_categories, all_categories)
print(unicodeToAscii("O'Néàl"))
```
Out:
```py
# categories: 18 ['Italian', 'German', 'Portuguese', 'Chinese', 'Greek', 'Polish', 'French', 'English', 'Spanish', 'Arabic', 'Czech', 'Russian', 'Irish', 'Dutch', 'Scottish', 'Vietnamese', 'Korean', 'Japanese']
O'Neal
```
## Creating the Network
This network extends [the last tutorial’s RNN](#Creating-the-Network) with an extra argument for the category tensor, which is concatenated along with the others. The category tensor is a one-hot vector just like the letter input.
We will interpret the output as the probability of the next letter. When sampling, the most likely output letter is used as the next input letter.
I added a second linear layer `o2o` (after combining hidden and output) to give it more muscle to work with. There’s also a dropout layer, which [randomly zeros parts of its input](https://arxiv.org/abs/1207.0580) with a given probability (here 0.1) and is usually used to fuzz inputs to prevent overfitting. Here we’re using it towards the end of the network to purposely add some chaos and increase sampling variety.
![](img/28a4f1426695fb55f1f6bc86278f6547.jpg)
```py
import torch
import torch.nn as nn
class RNN(nn.Module):
def __init__(self, input_size, hidden_size, output_size):
super(RNN, self).__init__()
self.hidden_size = hidden_size
self.i2h = nn.Linear(n_categories + input_size + hidden_size, hidden_size)
self.i2o = nn.Linear(n_categories + input_size + hidden_size, output_size)
self.o2o = nn.Linear(hidden_size + output_size, output_size)
self.dropout = nn.Dropout(0.1)
self.softmax = nn.LogSoftmax(dim=1)
def forward(self, category, input, hidden):
input_combined = torch.cat((category, input, hidden), 1)
hidden = self.i2h(input_combined)
output = self.i2o(input_combined)
output_combined = torch.cat((hidden, output), 1)
output = self.o2o(output_combined)
output = self.dropout(output)
output = self.softmax(output)
return output, hidden
def initHidden(self):
return torch.zeros(1, self.hidden_size)
```
## Training
### Preparing for Training
First of all, helper functions to get random pairs of (category, line):
```py
import random
# Random item from a list
def randomChoice(l):
return l[random.randint(0, len(l) - 1)]
# Get a random category and random line from that category
def randomTrainingPair():
category = randomChoice(all_categories)
line = randomChoice(category_lines[category])
return category, line
```
For each timestep (that is, for each letter in a training word) the inputs of the network will be `(category, current letter, hidden state)` and the outputs will be `(next letter, next hidden state)`. So for each training set, we’ll need the category, a set of input letters, and a set of output/target letters.
Since we are predicting the next letter from the current letter for each timestep, the letter pairs are groups of consecutive letters from the line - e.g. for `"ABCD<EOS>"` we would create (“A”, “B”), (“B”, “C”), (“C”, “D”), (“D”, “EOS”).
![](img/3fae03d85aed3a2237fd4b2f7fb7b480.jpg)
The category tensor is a [one-hot tensor](https://en.wikipedia.org/wiki/One-hot) of size `<1 x n_categories>`. When training, we feed it to the network at every timestep - this is a design choice; it could have been included as part of the initial hidden state or some other strategy.
```py
# One-hot vector for category
def categoryTensor(category):
li = all_categories.index(category)
tensor = torch.zeros(1, n_categories)
tensor[0][li] = 1
return tensor
# One-hot matrix of first to last letters (not including EOS) for input
def inputTensor(line):
tensor = torch.zeros(len(line), 1, n_letters)
for li in range(len(line)):
letter = line[li]
tensor[li][0][all_letters.find(letter)] = 1
return tensor
# LongTensor of second letter to end (EOS) for target
def targetTensor(line):
letter_indexes = [all_letters.find(line[li]) for li in range(1, len(line))]
letter_indexes.append(n_letters - 1) # EOS
return torch.LongTensor(letter_indexes)
```
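As a quick sanity check (an added illustration, not part of the original tutorial), the tensors line up exactly as described: each target index is the next letter, ending with the EOS index `n_letters - 1`:
```py
line = 'ABCD'
print(inputTensor(line).size())  # torch.Size([4, 1, 59]): one one-hot row per letter
print(targetTensor(line))        # indices of 'B', 'C', 'D', then the EOS index 58
```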
For convenience during training we’ll make a `randomTrainingExample` function that fetches a random (category, line) pair and turns them into the required (category, input, target) tensors.
```py
# Make category, input, and target tensors from a random category, line pair
def randomTrainingExample():
category, line = randomTrainingPair()
category_tensor = categoryTensor(category)
input_line_tensor = inputTensor(line)
target_line_tensor = targetTensor(line)
return category_tensor, input_line_tensor, target_line_tensor
```
### Training the Network
In contrast to classification, where only the last output is used, we are making a prediction at every step, so we are calculating loss at every step.
The magic of autograd allows you to simply sum these losses at each step and call backward at the end.
```py
criterion = nn.NLLLoss()
learning_rate = 0.0005
def train(category_tensor, input_line_tensor, target_line_tensor):
target_line_tensor.unsqueeze_(-1)
hidden = rnn.initHidden()
rnn.zero_grad()
loss = 0
for i in range(input_line_tensor.size(0)):
output, hidden = rnn(category_tensor, input_line_tensor[i], hidden)
l = criterion(output, target_line_tensor[i])
loss += l
loss.backward()
for p in rnn.parameters():
p.data.add_(-learning_rate, p.grad.data)
return output, loss.item() / input_line_tensor.size(0)
```
To keep track of how long training takes I am adding a `timeSince(timestamp)` function which returns a human readable string:
```py
import time
import math
def timeSince(since):
now = time.time()
s = now - since
m = math.floor(s / 60)
s -= m * 60
return '%dm %ds' % (m, s)
```
Training is business as usual - call train a bunch of times and wait a few minutes, printing the current time and loss every `print_every` examples, and storing an average loss per `plot_every` examples in `all_losses` for plotting later.
```py
rnn = RNN(n_letters, 128, n_letters)
n_iters = 100000
print_every = 5000
plot_every = 500
all_losses = []
total_loss = 0 # Reset every plot_every iters
start = time.time()
for iter in range(1, n_iters + 1):
output, loss = train(*randomTrainingExample())
total_loss += loss
if iter % print_every == 0:
print('%s (%d %d%%) %.4f' % (timeSince(start), iter, iter / n_iters * 100, loss))
if iter % plot_every == 0:
all_losses.append(total_loss / plot_every)
total_loss = 0
```
Out:
```py
0m 21s (5000 5%) 2.5152
0m 43s (10000 10%) 2.7758
1m 4s (15000 15%) 2.2884
1m 25s (20000 20%) 3.2404
1m 47s (25000 25%) 2.7298
2m 8s (30000 30%) 3.4301
2m 29s (35000 35%) 2.2306
2m 51s (40000 40%) 2.5628
3m 12s (45000 45%) 1.7700
3m 34s (50000 50%) 2.4657
3m 55s (55000 55%) 2.1909
4m 16s (60000 60%) 2.1004
4m 38s (65000 65%) 2.3524
4m 59s (70000 70%) 2.3339
5m 21s (75000 75%) 2.3936
5m 42s (80000 80%) 2.1886
6m 3s (85000 85%) 2.0739
6m 25s (90000 90%) 2.5451
6m 46s (95000 95%) 1.5104
7m 7s (100000 100%) 2.4600
```
### Plotting the Losses
Plotting the historical loss from `all_losses` shows the network learning:
```py
import matplotlib.pyplot as plt
import matplotlib.ticker as ticker
plt.figure()
plt.plot(all_losses)
```
![https://pytorch.org/tutorials/_images/sphx_glr_char_rnn_generation_tutorial_001.png](img/5ad82e2b23a82287af2caa2fe4b316b3.jpg)
## Sampling the Network
To sample we give the network a letter and ask what the next one is, feed that in as the next letter, and repeat until the EOS token.
* Create tensors for input category, starting letter, and empty hidden state
* Create a string `output_name` with the starting letter
* Up to a maximum output length,
* Feed the current letter to the network
* Get the next letter from highest output, and next hidden state
* If the letter is EOS, stop here
* If a regular letter, add to `output_name` and continue
* Return the final name
Note
Rather than having to give it a starting letter, another strategy would have been to include a “start of string” token in training and have the network choose its own starting letter.
```py
max_length = 20
# Sample from a category and starting letter
def sample(category, start_letter='A'):
with torch.no_grad(): # no need to track history in sampling
category_tensor = categoryTensor(category)
input = inputTensor(start_letter)
hidden = rnn.initHidden()
output_name = start_letter
for i in range(max_length):
output, hidden = rnn(category_tensor, input[0], hidden)
topv, topi = output.topk(1)
topi = topi[0][0]
if topi == n_letters - 1:
break
else:
letter = all_letters[topi]
output_name += letter
input = inputTensor(letter)
return output_name
# Get multiple samples from one category and multiple starting letters
def samples(category, start_letters='ABC'):
for start_letter in start_letters:
print(sample(category, start_letter))
samples('Russian', 'RUS')
samples('German', 'GER')
samples('Spanish', 'SPA')
samples('Chinese', 'CHI')
```
Out:
```py
Rovanik
Uakilovev
Shaveri
Garter
Eren
Romer
Santa
Parera
Artera
Chan
Ha
Iua
```
## Exercises
* Try with a different dataset of category -> line, for example:
  * Fictional series -> Character name
  * Part of speech -> Word
  * Country -> City
* Use a “start of sentence” token so that sampling can be done without choosing a start letter
* Get better results with a bigger and/or better shaped network
  * Try the `nn.LSTM` and `nn.GRU` layers
  * Combine multiple of these RNNs as a higher level network
**Total running time of the script:** (7 minutes 7.943 seconds)
[`Download Python source code: char_rnn_generation_tutorial.py`](../_downloads/8167177b6dd8ddf05bb9fe58744ac406/char_rnn_generation_tutorial.py)
[`Download Jupyter notebook: char_rnn_generation_tutorial.ipynb`](../_downloads/a35c00bb5afae3962e1e7869c66872fa/char_rnn_generation_tutorial.ipynb)
[Gallery generated by Sphinx-Gallery](https://sphinx-gallery.readthedocs.io)
# Loading a PyTorch Model in C++
As its name suggests, the primary interface to PyTorch is the Python programming language. While Python is a suitable and preferred language for many scenarios requiring dynamism and ease of iteration, there are equally many situations where precisely these properties of Python are unfavorable. One environment in which the latter often applies is _production_ – the land of low latencies and strict deployment requirements. For production scenarios, C++ is very often the language of choice, even if only to bind it into another language like Java, Rust or Go. The following paragraphs will outline the path PyTorch provides to go from an existing Python model to a serialized representation that can be _loaded_ and _executed_ purely from C++, with no dependency on Python.
## Step 1: Converting Your PyTorch Model to Torch Script
A PyTorch model’s journey from Python to C++ is enabled by [Torch Script](https://pytorch.org/docs/master/jit.html), a representation of a PyTorch model that can be understood, compiled and serialized by the Torch Script compiler. If you are starting out from an existing PyTorch model written in the vanilla “eager” API, you must first convert your model to Torch Script. In the most common cases, discussed below, this requires only a little effort. If you already have a Torch Script module, you can skip to the next section of this tutorial.
There exist two ways of converting a PyTorch model to Torch Script. The first is known as _tracing_, a mechanism in which the structure of the model is captured by evaluating it once using example inputs, and recording the flow of those inputs through the model. This is suitable for models that make limited use of control flow. The second approach is to add explicit annotations to your model that inform the Torch Script compiler that it may directly parse and compile your model code, subject to the constraints imposed by the Torch Script language.
Tip
You can find the complete documentation for both of these methods, as well as further guidance on which to use, in the official [Torch Script reference](https://pytorch.org/docs/master/jit.html).
### Converting to Torch Script via Tracing
To convert a PyTorch model to Torch Script via tracing, you must pass an instance of your model along with an example input to the `torch.jit.trace` function. This will produce a `torch.jit.ScriptModule` object with the trace of your model evaluation embedded in the module’s `forward` method:
```py
import torch
import torchvision
# An instance of your model.
model = torchvision.models.resnet18()
# An example input you would normally provide to your model's forward() method.
example = torch.rand(1, 3, 224, 224)
# Use torch.jit.trace to generate a torch.jit.ScriptModule via tracing.
traced_script_module = torch.jit.trace(model, example)
```
The traced `ScriptModule` can now be evaluated identically to a regular PyTorch module:
```py
In[1]: output = traced_script_module(torch.ones(1, 3, 224, 224))
In[2]: output[0, :5]
Out[2]: tensor([-0.2698, -0.0381, 0.4023, -0.3010, -0.0448], grad_fn=<SliceBackward>)
```
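As an added sanity check (not part of the original tutorial), the traced module should reproduce the eager model’s output on the example input:
```py
# Both calls run the same computation, so the results should match.
print(torch.allclose(model(example), traced_script_module(example)))  # expect: True
```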
### Converting to Torch Script via Annotation
Under certain circumstances, such as if your model employs particular forms of control flow, you may want to write your model in Torch Script directly and annotate your model accordingly. For example, say you have the following vanilla PyTorch model:
```py
import torch
class MyModule(torch.nn.Module):
def __init__(self, N, M):
super(MyModule, self).__init__()
self.weight = torch.nn.Parameter(torch.rand(N, M))
def forward(self, input):
if input.sum() > 0:
output = self.weight.mv(input)
else:
output = self.weight + input
return output
```
Because the `forward` method of this module uses control flow that is dependent on the input, it is not suitable for tracing. Instead, we can convert it to a `ScriptModule` by subclassing it from `torch.jit.ScriptModule` and adding a `@torch.jit.script_method` annotation to the model’s `forward` method:
```py
import torch
class MyModule(torch.jit.ScriptModule):
def __init__(self, N, M):
super(MyModule, self).__init__()
self.weight = torch.nn.Parameter(torch.rand(N, M))
@torch.jit.script_method
def forward(self, input):
if input.sum() > 0:
output = self.weight.mv(input)
else:
output = self.weight + input
return output
my_script_module = MyModule(2, 3)  # example sizes for N and M
```
Creating a new `MyModule` object now directly produces an instance of `ScriptModule` that is ready for serialization.
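Such a module can be serialized just like a traced one. As a small added example (the next section saves the traced `ResNet18` instead):
```py
my_script_module.save("my_annotated_module.pt")
```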
## Step 2: Serializing Your Script Module to a File
Once you have a `ScriptModule` in your hands, either from tracing or annotating a PyTorch model, you are ready to serialize it to a file. Later on, you’ll be able to load the module from this file in C++ and execute it without any dependency on Python. Say we want to serialize the `ResNet18` model shown earlier in the tracing example. To perform this serialization, simply call [save](https://pytorch.org/docs/master/jit.html#torch.jit.ScriptModule.save) on the module and pass it a filename:
```py
traced_script_module.save("model.pt")
```
This will produce a `model.pt` file in your working directory. We have now officially left the realm of Python and are ready to cross over to the sphere of C++.
## Step 3: Loading Your Script Module in C++
To load your serialized PyTorch model in C++, your application must depend on the PyTorch C++ API – also known as _LibTorch_. The LibTorch distribution encompasses a collection of shared libraries, header files and CMake build configuration files. While CMake is not a requirement for depending on LibTorch, it is the recommended approach and will be well supported into the future. For this tutorial, we will be building a minimal C++ application using CMake and LibTorch that simply loads and executes a serialized PyTorch model.
### A Minimal C++ Application
Let’s begin by discussing the code to load a module. The following will already do:
```cpp
#include <torch/script.h> // One-stop header.
#include <iostream>
#include <memory>
int main(int argc, const char* argv[]) {
if (argc != 2) {
std::cerr << "usage: example-app <path-to-exported-script-module>\n";
return -1;
}
// Deserialize the ScriptModule from a file using torch::jit::load().
std::shared_ptr<torch::jit::script::Module> module = torch::jit::load(argv[1]);
assert(module != nullptr);
std::cout << "ok\n";
}
```
The `<torch/script.h>` header encompasses all relevant includes from the LibTorch library necessary to run the example. Our application accepts the file path to a serialized PyTorch `ScriptModule` as its only command line argument and then proceeds to deserialize the module using the `torch::jit::load()` function, which takes this file path as input. In return we receive a shared pointer to a `torch::jit::script::Module`, the equivalent of a `torch.jit.ScriptModule` in C++. For now, we only verify that this pointer is not null. We will examine how to execute it in a moment.
### Depending on LibTorch and Building the Application
Assume we stored the above code into a file called `example-app.cpp`. A minimal `CMakeLists.txt` to build it could look as simple as:
```cmake
cmake_minimum_required(VERSION 3.0 FATAL_ERROR)
project(custom_ops)
find_package(Torch REQUIRED)
add_executable(example-app example-app.cpp)
target_link_libraries(example-app "${TORCH_LIBRARIES}")
set_property(TARGET example-app PROPERTY CXX_STANDARD 11)
```
The last thing we need to build the example application is the LibTorch distribution. You can always grab the latest stable release from the [download page](https://pytorch.org/) on the PyTorch website. If you download and unzip the latest archive, you should receive a folder with the following directory structure:
```
libtorch/
bin/
include/
lib/
share/
```
* The `lib/` folder contains the shared libraries you must link against,
* The `include/` folder contains header files your program will need to include,
* The `share/` folder contains the necessary CMake configuration to enable the simple `find_package(Torch)` command above.
The last step is building the application. For this, assume our example directory is laid out like this:
```
example-app/
CMakeLists.txt
example-app.cpp
```
We can now run the following commands to build the application from within the `example-app/` folder:
```sh
mkdir build
cd build
cmake -DCMAKE_PREFIX_PATH=/path/to/libtorch ..
make
```
where `/path/to/libtorch` should be the full path to the unzipped LibTorch distribution. If all goes well, it will look something like this:
```sh
root@4b5a67132e81:/example-app# mkdir build
root@4b5a67132e81:/example-app# cd build
root@4b5a67132e81:/example-app/build# cmake -DCMAKE_PREFIX_PATH=/path/to/libtorch ..
-- The C compiler identification is GNU 5.4.0
-- The CXX compiler identification is GNU 5.4.0
-- Check for working C compiler: /usr/bin/cc
-- Check for working C compiler: /usr/bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: /usr/bin/c++
-- Check for working CXX compiler: /usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Looking for pthread.h
-- Looking for pthread.h - found
-- Looking for pthread_create
-- Looking for pthread_create - not found
-- Looking for pthread_create in pthreads
-- Looking for pthread_create in pthreads - not found
-- Looking for pthread_create in pthread
-- Looking for pthread_create in pthread - found
-- Found Threads: TRUE
-- Configuring done
-- Generating done
-- Build files have been written to: /example-app/build
root@4b5a67132e81:/example-app/build# make
Scanning dependencies of target example-app
[ 50%] Building CXX object CMakeFiles/example-app.dir/example-app.cpp.o
[100%] Linking CXX executable example-app
[100%] Built target example-app
```
If we supply the path to the serialized `ResNet18` model we created earlier to the resulting `example-app` binary, we should be rewarded with a friendly “ok”:
```sh
root@4b5a67132e81:/example-app/build# ./example-app model.pt
ok
```
## Step 4: Executing the Script Module in C++
Having successfully loaded our serialized `ResNet18` in C++, we are now just a couple lines of code away from executing it! Let’s add those lines to our C++ application’s `main()` function:
```cpp
// Create a vector of inputs.
std::vector<torch::jit::IValue> inputs;
inputs.push_back(torch::ones({1, 3, 224, 224}));
// Execute the model and turn its output into a tensor.
at::Tensor output = module->forward(inputs).toTensor();
std::cout << output.slice(/*dim=*/1, /*start=*/0, /*end=*/5) << '\n';
```
The first two lines set up the inputs to our model. We create a vector of `torch::jit::IValue` (a type-erased value type `script::Module` methods accept and return) and add a single input. To create the input tensor, we use `torch::ones()`, the equivalent to `torch.ones` in the C++ API. We then run the `script::Module`’s `forward` method, passing it the input vector we created. In return we get a new `IValue`, which we convert to a tensor by calling `toTensor()`.
Tip
To learn more about functions like `torch::ones` and the PyTorch C++ API in general, refer to its documentation at [https://pytorch.org/cppdocs](https://pytorch.org/cppdocs). The PyTorch C++ API provides near feature parity with the Python API, allowing you to further manipulate and process tensors just like in Python.
In the last line, we print the first five entries of the output. Since we supplied the same input to our model in Python earlier in this tutorial, we should ideally see the same output. Let’s try it out by re-compiling our application and running it with the same serialized model:
```sh
root@4b5a67132e81:/example-app/build# make
Scanning dependencies of target example-app
[ 50%] Building CXX object CMakeFiles/example-app.dir/example-app.cpp.o
[100%] Linking CXX executable example-app
[100%] Built target example-app
root@4b5a67132e81:/example-app/build# ./example-app model.pt
-0.2698 -0.0381 0.4023 -0.3010 -0.0448
[ Variable[CPUFloatType]{1,5} ]
```
For reference, the output in Python previously was:
```py
tensor([-0.2698, -0.0381, 0.4023, -0.3010, -0.0448], grad_fn=<SliceBackward>)
```
Looks like a good match!
Tip
To move your model to GPU memory, you can write `model->to(at::kCUDA);`. Make sure the inputs to a model living in CUDA memory are also in CUDA memory by calling `tensor.to(at::kCUDA)`, which will return a new tensor in CUDA memory.
## Step 5: Getting Help and Exploring the API
This tutorial has hopefully equipped you with a general understanding of a PyTorch model’s path from Python to C++. With the concepts described in this tutorial, you should be able to go from a vanilla, “eager” PyTorch model, to a compiled `ScriptModule` in Python, to a serialized file on disk and – to close the loop – to an executable `script::Module` in C++.
Of course, there are many concepts we did not cover. For example, you may find yourself wanting to extend your `ScriptModule` with a custom operator implemented in C++ or CUDA, and executing this custom operator inside your `ScriptModule` loaded in your pure C++ production environment. The good news is: this is possible, and well supported! For now, you can explore [this](https://github.com/pytorch/pytorch/tree/master/test/custom_operator) folder for examples, and we will follow up with a tutorial shortly. In the meantime, the following links may be generally helpful:
* The Torch Script reference: [https://pytorch.org/docs/master/jit.html](https://pytorch.org/docs/master/jit.html)
* The PyTorch C++ API documentation: [https://pytorch.org/cppdocs/](https://pytorch.org/cppdocs/)
* The PyTorch Python API documentation: [https://pytorch.org/docs/](https://pytorch.org/docs/)
As always, if you run into any problems or have questions, you can use our [forum](https://discuss.pytorch.org/) or [GitHub issues](https://github.com/pytorch/pytorch/issues) to get in touch.
# Data Loading and Processing Tutorial
**Author**: [Sasank Chilamkurthy](https://chsasank.github.io)
A lot of effort in solving any machine learning problem goes into preparing the data. PyTorch provides many tools to make data loading easy and, hopefully, to make your code more readable. In this tutorial, we will see how to load and preprocess/augment data from a non-trivial dataset.
To run this tutorial, please make sure the following packages are installed:
* `scikit-image`: For image io and transforms
* `pandas`: For easier csv parsing
```py
from __future__ import print_function, division
import os
import torch
import pandas as pd
from skimage import io, transform
import numpy as np
import matplotlib.pyplot as plt
from torch.utils.data import Dataset, DataLoader
from torchvision import transforms, utils
# Ignore warnings
import warnings
warnings.filterwarnings("ignore")
plt.ion() # interactive mode
```
The dataset we are going to deal with is that of facial pose. This means that a face is annotated like this:
[![https://pytorch.org/tutorials/_images/landmarked_face2.png](img/a9d4cfeae43b1acb77f9175122955f26.jpg)](https://pytorch.org/tutorials/_images/landmarked_face2.png)
Overall, 68 different landmark points are annotated for each face.
Note
Download the dataset from [here](https://download.pytorch.org/tutorial/faces.zip) so that the images are in a directory named ‘data/faces/’. This dataset was actually generated by applying the excellent [dlib pose estimation](https://blog.dlib.net/2014/08/real-time-face-pose-estimation.html) to a few images from ImageNet tagged as ‘face’.
The dataset comes with a CSV file of annotations that looks like this:
```py
image_name,part_0_x,part_0_y,part_1_x,part_1_y,part_2_x, ... ,part_67_x,part_67_y
0805personali01.jpg,27,83,27,98, ... 84,134
1084239450_e76e00b7e7.jpg,70,236,71,257, ... ,128,312
```
Let’s quickly read the CSV and get the annotations in an (N, 2) array where N is the number of landmarks.
```py
landmarks_frame = pd.read_csv('data/faces/face_landmarks.csv')
n = 65
img_name = landmarks_frame.iloc[n, 0]
landmarks = landmarks_frame.iloc[n, 1:].values  # .as_matrix() is deprecated in recent pandas
landmarks = landmarks.astype('float').reshape(-1, 2)
print('Image name: {}'.format(img_name))
print('Landmarks shape: {}'.format(landmarks.shape))
print('First 4 Landmarks: {}'.format(landmarks[:4]))
```
Out:
```py
Image name: person-7.jpg
Landmarks shape: (68, 2)
First 4 Landmarks: [[32. 65.]
 [33. 76.]
 [34. 86.]
 [34. 97.]]
```
Let’s write a simple helper function to show an image and its landmarks and use it to show a sample.
```py
def show_landmarks(image, landmarks):
"""Show image with landmarks"""
plt.imshow(image)
plt.scatter(landmarks[:, 0], landmarks[:, 1], s=10, marker='.', c='r')
plt.pause(0.001) # pause a bit so that plots are updated
plt.figure()
show_landmarks(io.imread(os.path.join('data/faces/', img_name)),
landmarks)
plt.show()
```
![https://pytorch.org/tutorials/_images/sphx_glr_data_loading_tutorial_001.png](img/c6b4a228070733b782a708c471defe4a.jpg)
## Dataset class
`torch.utils.data.Dataset` is an abstract class representing a dataset. Your custom dataset should inherit `Dataset` and override the following methods:
* `__len__` so that `len(dataset)` returns the size of the dataset.
* `__getitem__` to support indexing such that `dataset[i]` can be used to get the `i`-th sample
Let’s create a dataset class for our face landmarks dataset. We will read the CSV in `__init__` but leave the reading of images to `__getitem__`. This is memory-efficient because the images are not all stored in memory at once but read as required.
A sample of our dataset will be a dict `{'image': image, 'landmarks': landmarks}`. Our dataset will take an optional argument `transform` so that any required processing can be applied to the sample. We will see the usefulness of `transform` in the next section.
```py
class FaceLandmarksDataset(Dataset):
"""Face Landmarks dataset."""
def __init__(self, csv_file, root_dir, transform=None):
"""
Args:
csv_file (string): Path to the csv file with annotations.
root_dir (string): Directory with all the images.
transform (callable, optional): Optional transform to be applied
on a sample.
"""
self.landmarks_frame = pd.read_csv(csv_file)
self.root_dir = root_dir
self.transform = transform
def __len__(self):
return len(self.landmarks_frame)
def __getitem__(self, idx):
img_name = os.path.join(self.root_dir,
self.landmarks_frame.iloc[idx, 0])
image = io.imread(img_name)
        landmarks = self.landmarks_frame.iloc[idx, 1:].values  # .as_matrix() is deprecated in recent pandas
landmarks = landmarks.astype('float').reshape(-1, 2)
sample = {'image': image, 'landmarks': landmarks}
if self.transform:
sample = self.transform(sample)
return sample
```
Let’s instantiate this class and iterate through the data samples. We will print the sizes of the first 4 samples and show their landmarks.
```py
face_dataset = FaceLandmarksDataset(csv_file='data/faces/face_landmarks.csv',
root_dir='data/faces/')
fig = plt.figure()
for i in range(len(face_dataset)):
sample = face_dataset[i]
print(i, sample['image'].shape, sample['landmarks'].shape)
ax = plt.subplot(1, 4, i + 1)
plt.tight_layout()
ax.set_title('Sample #{}'.format(i))
ax.axis('off')
show_landmarks(**sample)
if i == 3:
plt.show()
break
```
![https://pytorch.org/tutorials/_images/sphx_glr_data_loading_tutorial_002.png](img/80c0f612ddf710842d4cc31ee3c78da3.jpg)
Out:
```py
0 (324, 215, 3) (68, 2)
1 (500, 333, 3) (68, 2)
2 (250, 258, 3) (68, 2)
3 (434, 290, 3) (68, 2)
```
## Transforms
One issue we can see from the above is that the samples are not of the same size. Most neural networks expect images of a fixed size. Therefore, we will need to write some preprocessing code. Let’s create three transforms:
* `Rescale`: to scale the image
* `RandomCrop`: to crop from the image randomly. This is data augmentation.
* `ToTensor`: to convert the numpy images to torch images (we need to swap axes).
We will write them as callable classes instead of simple functions so that the parameters of the transform need not be passed every time it is called. For this, we just need to implement the `__call__` method and, if required, the `__init__` method. We can then use a transform like this:
```py
tsfm = Transform(params)
transformed_sample = tsfm(sample)
```
Observe below how these transforms had to be applied to both the image and the landmarks.
```py
class Rescale(object):
"""Rescale the image in a sample to a given size.
Args:
output_size (tuple or int): Desired output size. If tuple, output is
matched to output_size. If int, smaller of image edges is matched
to output_size keeping aspect ratio the same.
"""
def __init__(self, output_size):
assert isinstance(output_size, (int, tuple))
self.output_size = output_size
def __call__(self, sample):
image, landmarks = sample['image'], sample['landmarks']
h, w = image.shape[:2]
if isinstance(self.output_size, int):
if h > w:
new_h, new_w = self.output_size * h / w, self.output_size
else:
new_h, new_w = self.output_size, self.output_size * w / h
else:
new_h, new_w = self.output_size
new_h, new_w = int(new_h), int(new_w)
img = transform.resize(image, (new_h, new_w))
# h and w are swapped for landmarks because for images,
# x and y axes are axis 1 and 0 respectively
landmarks = landmarks * [new_w / w, new_h / h]
return {'image': img, 'landmarks': landmarks}
class RandomCrop(object):
"""Crop randomly the image in a sample.
Args:
output_size (tuple or int): Desired output size. If int, square crop
is made.
"""
def __init__(self, output_size):
assert isinstance(output_size, (int, tuple))
if isinstance(output_size, int):
self.output_size = (output_size, output_size)
else:
assert len(output_size) == 2
self.output_size = output_size
def __call__(self, sample):
image, landmarks = sample['image'], sample['landmarks']
h, w = image.shape[:2]
new_h, new_w = self.output_size
top = np.random.randint(0, h - new_h)
left = np.random.randint(0, w - new_w)
image = image[top: top + new_h,
left: left + new_w]
landmarks = landmarks - [left, top]
return {'image': image, 'landmarks': landmarks}
class ToTensor(object):
"""Convert ndarrays in sample to Tensors."""
def __call__(self, sample):
image, landmarks = sample['image'], sample['landmarks']
# swap color axis because
# numpy image: H x W x C
# torch image: C X H X W
image = image.transpose((2, 0, 1))
return {'image': torch.from_numpy(image),
'landmarks': torch.from_numpy(landmarks)}
```
### Compose transforms
Now, we apply the transforms to a sample.
Let’s say we want to rescale the shorter side of the image to 256 and then randomly crop a square of size 224 from it, i.e., we want to compose the `Rescale` and `RandomCrop` transforms. `torchvision.transforms.Compose` is a simple callable class which allows us to do this.
```py
scale = Rescale(256)
crop = RandomCrop(128)
composed = transforms.Compose([Rescale(256),
RandomCrop(224)])
# Apply each of the above transforms on sample.
fig = plt.figure()
sample = face_dataset[65]
for i, tsfrm in enumerate([scale, crop, composed]):
transformed_sample = tsfrm(sample)
ax = plt.subplot(1, 3, i + 1)
plt.tight_layout()
ax.set_title(type(tsfrm).__name__)
show_landmarks(**transformed_sample)
plt.show()
```
![https://pytorch.org/tutorials/_images/sphx_glr_data_loading_tutorial_003.png](img/968cafa6f1b4c8e71a47c64ae7d2a72d.jpg)
## Iterating through the dataset
Let’s put this all together to create a dataset with composed transforms. To summarize, every time this dataset is sampled:
* An image is read from the file on the fly
* Transforms are applied to the read image
* Since one of the transforms is random, the data is augmented on sampling
We can iterate over the created dataset with a `for i in range` loop as before.
```py
transformed_dataset = FaceLandmarksDataset(csv_file='data/faces/face_landmarks.csv',
root_dir='data/faces/',
transform=transforms.Compose([
Rescale(256),
RandomCrop(224),
ToTensor()
]))
for i in range(len(transformed_dataset)):
sample = transformed_dataset[i]
print(i, sample['image'].size(), sample['landmarks'].size())
if i == 3:
break
```
Out:
```py
0 torch.Size([3, 224, 224]) torch.Size([68, 2])
1 torch.Size([3, 224, 224]) torch.Size([68, 2])
2 torch.Size([3, 224, 224]) torch.Size([68, 2])
3 torch.Size([3, 224, 224]) torch.Size([68, 2])
```
However, we are losing a lot of features by using a simple `for` loop to iterate over the data. In particular, we are missing out on:
* Batching the data
* Shuffling the data
* Loading the data in parallel using `multiprocessing` workers.
`torch.utils.data.DataLoader` is an iterable which provides all these features. The parameters used below should be clear. One parameter of interest is `collate_fn`: you can specify exactly how the samples are batched using it. However, the default collate should work fine for most use cases; a custom `collate_fn` is sketched after the example below.
```py
dataloader = DataLoader(transformed_dataset, batch_size=4,
shuffle=True, num_workers=4)
# Helper function to show a batch
def show_landmarks_batch(sample_batched):
"""Show image with landmarks for a batch of samples."""
images_batch, landmarks_batch = \
sample_batched['image'], sample_batched['landmarks']
batch_size = len(images_batch)
im_size = images_batch.size(2)
grid = utils.make_grid(images_batch)
plt.imshow(grid.numpy().transpose((1, 2, 0)))
for i in range(batch_size):
plt.scatter(landmarks_batch[i, :, 0].numpy() + i * im_size,
landmarks_batch[i, :, 1].numpy(),
s=10, marker='.', c='r')
plt.title('Batch from dataloader')
for i_batch, sample_batched in enumerate(dataloader):
print(i_batch, sample_batched['image'].size(),
sample_batched['landmarks'].size())
# observe 4th batch and stop.
if i_batch == 3:
plt.figure()
show_landmarks_batch(sample_batched)
plt.axis('off')
plt.ioff()
plt.show()
break
```
![https://pytorch.org/tutorials/_images/sphx_glr_data_loading_tutorial_004.png](img/f12c0231a67af28c3057e0ed3fa7f993.jpg)
Out:
```py
0 torch.Size([4, 3, 224, 224]) torch.Size([4, 68, 2])
1 torch.Size([4, 3, 224, 224]) torch.Size([4, 68, 2])
2 torch.Size([4, 3, 224, 224]) torch.Size([4, 68, 2])
3 torch.Size([4, 3, 224, 224]) torch.Size([4, 68, 2])
```
## Afterword: torchvision
In this tutorial, we have seen how to write and use datasets, transforms, and dataloaders. The `torchvision` package provides some common datasets and transforms. You might not even have to write custom classes. One of the more generic datasets available in torchvision is `ImageFolder`. It assumes that images are organized in the following way:
```py
root/ants/xxx.png
root/ants/xxy.jpeg
root/ants/xxz.png
.
.
.
root/bees/123.jpg
root/bees/nsdf3.png
root/bees/asd932_.png
```
where ‘ants’, ‘bees’, etc. are class labels. Similarly, generic transforms which operate on `PIL.Image`, such as `RandomHorizontalFlip` and `Scale`, are also available. You can use these to write a dataloader like this:
```py
import torch
from torchvision import transforms, datasets
data_transform = transforms.Compose([
        transforms.RandomResizedCrop(224),  # RandomSizedCrop is deprecated in recent torchvision
transforms.RandomHorizontalFlip(),
transforms.ToTensor(),
transforms.Normalize(mean=[0.485, 0.456, 0.406],
std=[0.229, 0.224, 0.225])
])
hymenoptera_dataset = datasets.ImageFolder(root='hymenoptera_data/train',
transform=data_transform)
dataset_loader = torch.utils.data.DataLoader(hymenoptera_dataset,
batch_size=4, shuffle=True,
num_workers=4)
```
For an example with training code, please see [Transfer Learning Tutorial](transfer_learning_tutorial.html).
**Total running time of the script:** ( 0 minutes 58.325 seconds)
# Deep Learning with PyTorch: A 60 Minute Blitz
**Author**: [Soumith Chintala](http://soumith.ch)
Goal of this tutorial:
* Understand PyTorch’s Tensor library and neural networks at a high level.
* Train a small neural network to classify images
_This tutorial assumes that you have a basic familiarity with numpy_
Note
Make sure you have the [torch](https://github.com/pytorch/pytorch) and [torchvision](https://github.com/pytorch/vision) packages installed.
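If they are not installed yet, a typical pip install looks like this (the exact command may vary with your platform and CUDA version; see [pytorch.org](https://pytorch.org/) for the official install selector):
```py
pip install torch torchvision
```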
![https://pytorch.org/tutorials/_images/tensor_illustration_flat.png](img/0c7a402331744a44f5e17575b1607904.jpg)
[What is PyTorch?](blitz/tensor_tutorial.html#sphx-glr-beginner-blitz-tensor-tutorial-py)
![https://pytorch.org/tutorials/_images/autodiff.png](img/0a7a97c39d6dfc0e08d2701eb7a49231.jpg)
[Autograd: Automatic Differentiation](blitz/autograd_tutorial.html#sphx-glr-beginner-blitz-autograd-tutorial-py)
![https://pytorch.org/tutorials/_images/mnist1.png](img/be60e8e1f4baa0de87cf9d37c5325525.jpg)
[Neural Networks](blitz/neural_networks_tutorial.html#sphx-glr-beginner-blitz-neural-networks-tutorial-py)
![https://pytorch.org/tutorials/_images/cifar101.png](img/7a28f697e6bab9f3d9b1e8da4a5a5249.jpg)
[Training a Classifier](blitz/cifar10_tutorial.html#sphx-glr-beginner-blitz-cifar10-tutorial-py)
![https://pytorch.org/tutorials/_images/data_parallel.png](img/c699a36b37c0fd5aec258278788c1216.jpg)
[Optional: Data Parallelism](blitz/data_parallel_tutorial.html#sphx-glr-beginner-blitz-data-parallel-tutorial-py)