2021-01-16 21:30:47

95bea221 · wizardforcel · 8d9884e7 · 95bea221

Showing 24 additions and 24 deletions

new/dl-pt-workshop/6.md +24 -24

--- a/new/dl-pt-workshop/6.md
+++ b/new/dl-pt-workshop/6.md
@@ -32,7 +32,7 @@

 The following is a brief explanation of some of the different tasks that can be performed using RNNs:

-*   **NLP**: This refers to the ability of machines to represent human language. This is perhaps one of the most explored areas of deep learning and undoubtedly the preferred data problem when making use of RNNs. The idea is to train the network using text as input data, such as poems and books, among others, with the objective of creating a model that is capable of generating such texts.
+*   **NLP**。这指的是机器代表人类语言的能力。这可能是深度学习中探索最多的领域之一，也无疑是利用RNN时首选的数据问题。其思路是以文本作为输入数据来训练网络，如诗词和书籍等，目的是创建一个能够生成此类文本的模型。

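    NLP is commonly used to create chatbots (virtual assistants). By learning from previous human conversations, NLP models can help a person with frequently asked questions or queries. Their ability to formulate sentences is therefore limited to what they learned during training, which means they can only respond to what they have been taught.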

@@ -42,7 +42,7 @@

 Figure 6.1: Facebook's Messenger chatbot

-*   **Speech recognition**: Similar to NLP, speech recognition attempts to understand and represent human language. However, the difference here is that the former (NLP) is trained on and produces the output in the form of text, while the latter (speech recognition) uses audio clips. With the proliferation of developments in this field and the interest of big companies, these models are capable of understanding different languages and even different accents and pronunciation.
+*   **语音识别**。与NLP类似，语音识别试图理解和表示人类语言。然而，这里的区别在于，前者（NLP）是以文本的形式进行训练并产生输出，而后者（语音识别）则使用音频片段。随着这一领域的发展，以及大公司的兴趣，这些模型能够理解不同的语言，甚至不同的口音和发音。

    A popular example of a speech recognition device is Alexa, a voice-activated virtual assistant from Amazon:

@@ -50,7 +50,7 @@

 Figure 6.2: Amazon's Alexa

-*   **Machine translation**: This refers to a machine's ability to translate human languages effectively. According to this, the input is the source language (for instance, Spanish) and the output is the target language (for instance, English). The main difference between NLP and machine translation is that, in the latter, the output is built after the entire input has been fed to the model.
+*   **机器翻译**。这是指机器有效翻译人类语言的能力。据此，输入是源语言（如西班牙语），输出是目标语言（如英语）。NLP与机器翻译的主要区别在于，在后者中，输出是在将整个输入送入模型后建立的。

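    With the rise of globalization and the popularity of leisure travel, people need to work across more than one language. This has led to devices that can translate between different languages. One of the latest creations in this field is Google's Pixel Buds, which can perform translations in real time: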

@@ -58,7 +58,7 @@

 Figure 6.3: Google's Pixel Buds

-*   **Time-series forecasting**: A less popular application of an RNN is the prediction of a sequence of data points in the future based on historical data. RNNs are particularly good at this task due to their ability to retain an internal memory, which allows time-series analysis to consider the different timesteps in the past to perform a prediction or a series of predictions in the future.
+*   **时间序列预测**。RNN的一个不太流行的应用是根据历史数据预测未来的数据点序列。由于RNN能够保留内部记忆，使时间序列分析能够考虑过去的不同时间段来进行未来的预测或一系列预测，因此RNN特别擅长这项任务。

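    This is often used to predict future income or demand, which helps a company prepare for different scenarios. The following figure shows a forecast of monthly sales: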

@@ -132,7 +132,7 @@

 For the exercises and activities in this chapter, you will need Python 3.7, Jupyter 6.0, Matplotlib 3.1, NumPy 1.17, Pandas 0.25, and PyTorch 1.3+ (preferably PyTorch 1.4) installed on your local machine.

-1.  Import the following libraries:
+1.  导入以下库：

    ```py
    import pandas as pd
@@ -140,7 +140,7 @@
    import torch
    ```

-2.  Create a Pandas DataFrame that's 10 x 5 in size, filled with random numbers ranging from 0 to 100\. Name the five columns as follows: **["Week1", "Week2", "Week3", "Week4", "Week5"]**.
+2.  创建一个`10×5`大小的Pandas DataFrame，里面充满了从0到100的随机数。命名五列如下：`["Week1", "Week2", "Week3", "Week4", "Week5"]`。

    Make sure that you set the random seed to `0` so that you can reproduce the results shown in this book:
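
The diff view cuts off before the code for this step. As a minimal sketch of what step 2 describes, assuming NumPy's legacy `np.random.seed` and `np.random.randint` are used to generate the values (the hunk does not show which generator the book uses), the DataFrame could be built like this:

```python
import numpy as np
import pandas as pd

# Assumption: seeding NumPy's global generator is how the seed of 0 is applied.
np.random.seed(0)

# 10 rows x 5 columns of random integers. Note that randint excludes the
# upper bound, so values fall in the range 0..99.
data = pd.DataFrame(
    np.random.randint(0, 100, size=(10, 5)),
    columns=["Week1", "Week2", "Week3", "Week4", "Week5"],
)

print(data.shape)
```

The exact numbers depend on the generator and library version, so they may not match the figure in the book.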

@@ -162,7 +162,7 @@

    Figure 6.10: The created DataFrame

-3.  Create an input and a target variable, considering that the input variable should contain all the values of all the instances, except the last column of data. The target variable should contain all the values of all the instances, except the first column:
+3.  创建一个输入变量和一个目标变量，考虑到输入变量应该包含所有实例的所有值，除了最后一列数据。目标变量应包含所有实例的所有值，但第一列数据除外。

    ```py
    inputs = data.iloc[:,:-1]
@@ -170,7 +170,7 @@
                           fill_value=data.iloc[:,-1:])
    ```

-4.  Print the input variable to verify its contents, as follows:
+4.  打印输入变量以验证其内容，如下图所示。

    ```py
    inputs
@@ -182,7 +182,7 @@

    Figure 6.11: Input variable

-5.  Print the resulting target variable using the following code:
+5.  使用下面的代码打印出目标变量。

    ```py
    targets
@@ -261,7 +261,7 @@ for i in range(1, epochs+1):
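 1.  Import the required libraries.
 2.  Load the dataset and slice it so that it contains all the rows but only the columns from index 1 to 52.
 3.  Plot the weekly sales transactions of five randomly chosen products from the entire dataset. Use a random seed of `0` when sampling to get the same results as in the current activity.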
-4.  Create the **inputs** and **targets** variables, which will be fed to the network to create the model. These variables should be of the same shape and be converted into PyTorch tensors.
+4.  创建`inputs`和`targets`变量，这些变量将被输入到网络中以创建模型。这些变量应具有相同的形状，并转换为 PyTorch 张量。

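    The `inputs` variable should contain the data for all the products for all the weeks except the last week, since the purpose of the model is to predict that final week.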
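
The slicing described here can be sketched with plain NumPy. The array below is hypothetical stand-in data (8 products by 52 weeks), and the targets follow the same one-step-ahead pattern as the earlier exercise, which is an assumption about this activity rather than something the hunk shows. Conversion to PyTorch tensors is left out of the sketch.

```python
import numpy as np

# Hypothetical stand-in for the sliced dataset: 8 products x 52 weekly values.
rng = np.random.default_rng(0)
sales = rng.integers(0, 50, size=(8, 52))

# Inputs: every week except the last one.
inputs = sales[:, :-1]

# Targets: every week except the first one, i.e. the inputs shifted one
# step ahead, so the final target column is the week to be predicted.
targets = sales[:, 1:]

print(inputs.shape, targets.shape)
```

Both arrays end up with the same shape, which is what step 4 requires before they are converted to tensors.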

@@ -272,7 +272,7 @@ for i in range(1, epochs+1):
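 7.  Define the loss function, the optimization algorithm, and the number of epochs to train the network. Use the mean squared error loss function, the Adam optimizer, and 10,000 epochs.
 8.  Use a `for` loop to perform the training process by going through all the epochs. In each epoch, a prediction must be made, followed by the calculation of the loss function and the optimization of the network's parameters. Then, save the loss of each epoch.
 9.  Plot the losses of all the epochs.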
-10.  Using a scatter plot, display the predictions that were obtained in the last epoch of the training process against the ground truth values (that is, the sales transactions of the last week).
+10.  使用散点图，显示在训练过程的最后一个纪元中获得的预测值与地面真实值（即上周的销售交易）的对比。

    Note

@@ -336,7 +336,7 @@ for i in range(1, epochs+1):

 Here, `L[t]` refers to the output from the learn gate, while `F[t]` refers to the output from the forget gate.

-*   **Use gate**: This is also known as the output gate. Here, the information from both the learn and forget gates are joined together in the use gate. This gate makes use of all the relevant information to perform a prediction, which also becomes the new short-term memory.
+*   **使用门**。这也称为输出门。在这里，来自学习门和遗忘门的信息被合并到使用门中。该门利用所有相关信息进行预测，也成为新的短期记忆。

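    This is done in three steps. First, it applies a linear function and an activation function (*tanh*) to the output of the forget gate. Second, it applies a linear function and an activation function (*sigmoid*) to the short-term memory and the current event. Third, it multiplies the outputs of the previous two steps. The output of the third step is both the new short-term memory and the prediction for the current step: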
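
The three steps of the use gate can be sketched numerically with NumPy. All names, sizes, and weights below (`W_u`, `W_v`, `forget_out`, `stm_prev`, `event`) are hypothetical and randomly initialized for illustration; they are not taken from the book.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
hidden, n_in = 4, 3  # illustrative sizes

# Hypothetical inputs to the gate.
forget_out = rng.standard_normal(hidden)  # F[t], output of the forget gate
stm_prev = rng.standard_normal(hidden)    # previous short-term memory
event = rng.standard_normal(n_in)         # current event

# Hypothetical parameters of the two linear functions.
W_u, b_u = rng.standard_normal((hidden, hidden)), np.zeros(hidden)
W_v, b_v = rng.standard_normal((hidden, hidden + n_in)), np.zeros(hidden)

# Step 1: linear function followed by tanh on the forget gate output.
u = np.tanh(W_u @ forget_out + b_u)

# Step 2: linear function followed by sigmoid on short-term memory and event.
v = sigmoid(W_v @ np.concatenate([stm_prev, event]) + b_v)

# Step 3: elementwise product gives the new short-term memory / prediction.
stm_new = u * v

print(stm_new)
```

Because the tanh output lies in [-1, 1] and the sigmoid output in [0, 1], every element of the new short-term memory stays within [-1, 1].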

@@ -456,19 +456,19 @@ onehot = onehot_flat.reshape((batch.shape[0],\

 In this exercise, you will preprocess a text snippet and then convert it into a one-hot matrix. Follow these steps to complete this exercise:

-1.  Import NumPy:
+1.  导入NumPy。

    ```py
    import numpy as np
    ```

-2.  Create a variable named **text**, which will contain the text sample **"Hello World!"**:
+2.  创建一个名为`text`的变量，其中将包含文本样本`"Hello World!"`。

    ```py
    text = "Hello World!"
    ```

-3.  Create a dictionary by mapping each letter to a number:
+3.  通过将每个字母映射到一个数字来创建一个字典。

    ```py
    chars = list(set(text))
@@ -483,7 +483,7 @@ onehot = onehot_flat.reshape((batch.shape[0],\
    {'d': 0, 'o': 1, 'H': 2, ' ': 3, 'e': 4, 'W': 5, '!': 6, 'l': 7, 'r': 8}
    ```

-4.  Encode your text sample with the numbers we defined in the previous step:
+4.  用我们在上一步中定义的数字对你的文本样本进行编码。

    ```py
    encoded = []
@@ -491,7 +491,7 @@ onehot = onehot_flat.reshape((batch.shape[0],\
        encoded.append(indexer[c])
    ```

-5.  Convert the encoded variable into a NumPy array and reshape it so that the sentence is divided into two sequences of the same size:
+5.  将编码变量转换为NumPy数组，并对其进行重塑，使句子被分成两个大小相同的序列。

    ```py
    encoded = np.array(encoded).reshape(2,-1)
@@ -505,7 +505,7 @@ onehot = onehot_flat.reshape((batch.shape[0],\
           [5, 1, 8, 7, 0, 6]])
    ```

-6.  Define a function that takes an array of numbers and creates a one-hot matrix:
+6.  定义一个函数，接收一个数字数组，并创建一个单热矩阵。

    ```py
    def index2onehot(batch):
@@ -519,7 +519,7 @@ onehot = onehot_flat.reshape((batch.shape[0],\
        return onehot
    ```

-7.  Convert the encoded array into a one-hot matrix by passing it through the previously defined function:
+7.  通过之前定义的函数将编码数组转换为单热矩阵。

    ```py
    one_hot = index2onehot(encoded)
@@ -688,20 +688,20 @@ while starter[-1] != "." and counter < 50:
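 6.  Create a class that defines the network architecture. The class should contain an additional function that initializes the states of the LSTM layers.
 7.  Determine the number of batches to be created from your dataset, bearing in mind that each batch should contain 100 sequences, each with a length of 50. Next, split the encoded data into 100 sequences.
 8.  Instantiate your model using 256 as the number of hidden units, for a total of two recurrent layers.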
-9.  Define the loss function and the optimization algorithms. Use the Adam optimizer and the cross-entropy loss. Train the network for 20 epochs.
+9.  定义损失函数和优化算法。使用Adam优化器和交叉熵损失。训练网络20个纪元。

    Note

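    Depending on your resources, the training process can take a long time, which is why it is recommended to run only 20 epochs. However, an equivalent version of the code that can run on a GPU is available in the book's GitHub repository. This will allow you to run more epochs and achieve outstanding performance.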

-10.  In each epoch, the data must be divided into batches with a sequence length of 50\. This means that each epoch will have 100 sequences, each with a length of 50.
+10.  在每一个时代，数据必须被划分为50个序列长度的批次。这意味着每个时代将有100个序列，每个序列的长度为50。

    Note

    Batches are created for both the inputs and the targets, where the latter is a copy of the former shifted one step ahead.

 11.  Plot the progress of the loss over time.
-12.  Feed the following sentence starter into the trained model and complete the sentence: "So she was considering in her own mind ".
+12.  将下面的句子启动器输入到训练好的模型中，并完成这个句子：`"So she was considering in her own mind "`。

    Note

@@ -758,7 +758,7 @@ NLP 是**人工智能**（**AI**）的子字段，它通过使计算机能够理

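 As with any other data problem, you need to load the data into the code, bearing in mind that different methodologies are used for different data types. Besides converting the entire set of words to lowercase, the data undergoes some basic transformations that allow you to feed it into the network. The most common transformations are as follows: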

-*   **Eliminating punctuation**: When processing text data word by word for NLP purposes, remove any punctuation. This is done to avoid taking the same word as two separate words because one of them is followed by a period, comma, or any other special character. Once this has been achieved, it is possible to define a list containing the vocabulary (that is, the entire set of words) of the input text.
+*   **消除标点符号**。在为NLP目的逐字处理文本数据时，删除任何标点符号。这样做是为了避免把同一个词当作两个独立的词，因为其中一个词后面有句号、逗号或任何其他特殊字符。一旦实现了这一点，就可以定义一个包含输入文本词汇的列表（也就是整个词集）。

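    This can be done using the pre-initialized `punctuation` string from the `string` module, which provides a list of punctuation characters that can be used to identify them within a text sequence, as shown in the following code snippet: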
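
The book's own snippet is not part of this hunk. As a minimal sketch of the same idea using only the standard library, with a made-up sample sentence and hypothetical variable names (`cleaned`, `vocab`, `word_to_idx`), punctuation removal and vocabulary building might look like this:

```python
import string

# Made-up sample text for illustration.
text = "Hello, world! Hello again."

# Lowercase first, then strip every character listed in string.punctuation.
cleaned = text.lower()
for ch in string.punctuation:
    cleaned = cleaned.replace(ch, "")

# The vocabulary is the set of distinct words; sorting makes it deterministic.
vocab = sorted(set(cleaned.split()))

# Map each word to an integer and encode the text with those integers.
word_to_idx = {word: i for i, word in enumerate(vocab)}
encoded = [word_to_idx[word] for word in cleaned.split()]

print(cleaned)
print(vocab)
print(encoded)
```

Without the punctuation step, `hello,` and `hello` would be counted as two different vocabulary entries, which is the problem the text describes.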

@@ -828,7 +828,7 @@ class LSTM(nn.Module):
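 3.  Remove the punctuation from the reviews.
 4.  Create a variable containing the vocabulary of the entire set of reviews. Additionally, create a dictionary that maps each word to an integer, where the words are the keys and the integers are the values.
 5.  Encode the review data by replacing each word in a review with its paired integer.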
-6.  Create a class containing the architecture of the network. Make sure that you include an embedding layer.
+6.  创建一个包含网络架构的类。确保你包含一个嵌入层。

    Note