8. Train the network using the training set's data. Use the validation set to measure performance. To do this, save the loss and the accuracy for both the training and validation sets in each epoch.
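The bookkeeping described above can be sketched as follows. This is a minimal illustration, not the chapter's actual code: the data, model, and hyperparameters here are stand-ins.

```python
import torch
import torch.nn as nn

# Stand-in data and model so the sketch is self-contained
torch.manual_seed(0)
X_train, y_train = torch.randn(80, 10), torch.randint(0, 2, (80,))
X_dev, y_dev = torch.randn(20, 10), torch.randint(0, 2, (20,))

model = nn.Linear(10, 2)
loss_function = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)

epochs = 5
train_losses, dev_losses, train_accs, dev_accs = [], [], [], []
for epoch in range(epochs):
    model.train()
    optimizer.zero_grad()
    pred = model(X_train)
    loss = loss_function(pred, y_train)
    loss.backward()
    optimizer.step()

    # Save the loss and accuracy for the training set
    train_losses.append(loss.item())
    train_accs.append((pred.argmax(dim=1) == y_train).float().mean().item())

    # Save the loss and accuracy for the validation set
    model.eval()
    with torch.no_grad():
        dev_pred = model(X_dev)
        dev_losses.append(loss_function(dev_pred, y_dev).item())
        dev_accs.append((dev_pred.argmax(dim=1) == y_dev).float().mean().item())
```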
Note
...
...
@@ -722,7 +722,7 @@ batch_size = 100
This exercise does not require any coding; instead, it requires you to analyze the results of the previous activity.
1. Assuming a Bayes error of **0.15**, perform error analysis and diagnose the model:
1. Assuming a Bayes error of `0.15`, perform error analysis and diagnose the model:
8. Make a line plot to display the loss value for each iteration step:
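A line plot of the stored loss values can be produced with matplotlib; here `losses` is a stand-in for whatever list of per-iteration loss values was collected:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the script runs without a display
import matplotlib.pyplot as plt

# Stand-in loss values; replace with the list collected during training
losses = [1.2, 0.9, 0.7, 0.55, 0.48, 0.43]

plt.plot(range(len(losses)), losses)
plt.title("Loss value per iteration")
plt.xlabel("Iteration")
plt.ylabel("Loss")
plt.savefig("loss_plot.png")
```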
...
...
@@ -428,7 +428,7 @@
return
7. Instantiate the model and define all the variables required to train the model. Set the number of epochs to **50** and the batch size to **128**. Use a learning rate of **0.001**:
7. Instantiate the model and define all the variables required to train the model. Set the number of epochs to `50` and the batch size to `128`. Use a learning rate of `0.001`:
7. Define all of the parameters that are required to train your model. Set the number of epochs to **50**:
7. Define all of the parameters that are required to train your model. Set the number of epochs to `50`:
model = CNN()
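The variables named in this step can be sketched as follows. The `CNN` class here is a hypothetical stand-in; the chapter defines the real architecture in an earlier step.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for the chapter's CNN architecture
class CNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 8, kernel_size=3, padding=1)
        self.fc = nn.Linear(8 * 32 * 32, 10)

    def forward(self, x):
        x = torch.relu(self.conv(x))
        return self.fc(x.reshape(x.shape[0], -1))

# Instantiate the model and define the training variables from the step above
model = CNN()
loss_function = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
epochs = 50
batch_size = 128
```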
...
...
@@ -1496,7 +1496,7 @@
2. Change the definition of the **transform** variable so that it includes, in addition to normalizing and converting the data into tensors, the following transformations:
6. Instantiate the **class** function containing the model. Feed the input size, the number of neurons in each recurrent layer (**10**), and the number of recurrent layers (`1`):
6. Instantiate the **class** function containing the model. Feed the input size, the number of neurons in each recurrent layer (`10`), and the number of recurrent layers (`1`):
model = RNN(data_train.shape[1], 10, 1)
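A minimal sketch consistent with that call is shown below; the class body is an assumption (the chapter defines the real one earlier), and `5` stands in for `data_train.shape[1]`, the number of input features.

```python
import torch.nn as nn

# Assumed shape of the chapter's RNN class
class RNN(nn.Module):
    def __init__(self, input_size, hidden_size, num_layers):
        super().__init__()
        self.rnn = nn.RNN(input_size, hidden_size, num_layers, batch_first=True)
        self.output = nn.Linear(hidden_size, 1)

    def forward(self, x, hidden=None):
        out, hidden = self.rnn(x, hidden)
        return self.output(out), hidden

# 5 stands in for data_train.shape[1]; 10 neurons per layer, 1 recurrent layer
model = RNN(5, 10, 1)
```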
...
...
@@ -2346,7 +2346,7 @@
x = np.array(x).reshape((n_seq, -1))
8. Instantiate your model by using **256** as the number of hidden units for a total of two recurrent layers:
8. Instantiate your model by using `256` as the number of hidden units for a total of two recurrent layers:
model = LSTM(len(chars), 256, 2)
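A character-level sketch consistent with that instantiation follows; `chars` and the class body are assumptions standing in for the chapter's earlier definitions.

```python
import torch.nn as nn

# Stand-in for the set of unique characters in the training text
chars = sorted(set("hello world"))

# Assumed shape of the chapter's LSTM class
class LSTM(nn.Module):
    def __init__(self, char_length, hidden_size, n_layers):
        super().__init__()
        self.lstm = nn.LSTM(char_length, hidden_size, n_layers, batch_first=True)
        self.output = nn.Linear(hidden_size, char_length)

    def forward(self, x, states):
        out, states = self.lstm(x, states)
        return self.output(out), states

# 256 hidden units across two recurrent layers
model = LSTM(len(chars), 256, 2)
```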
...
...
@@ -2366,7 +2366,7 @@
model = LSTM(len(chars), 256, 2).to("cuda")
9. Define the loss function and the optimization algorithms. Use the Adam optimizer and the cross-entropy loss to do this. Train the network for **20** epochs:
9. Define the loss function and the optimization algorithms. Use the Adam optimizer and the cross-entropy loss to do this. Train the network for `20` epochs:
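This step amounts to the following; the `model` here is a small stand-in so the snippet is self-contained (the real model comes from the previous step).

```python
import torch
import torch.nn as nn

# Stand-in model; in the chapter this is the LSTM from the previous step
model = nn.Linear(4, 3)

# Cross-entropy loss with the Adam optimizer, trained for 20 epochs
loss_function = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters())
epochs = 20
```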
5. We then define the length of our embeddings. While this can technically be any number you wish, there are some tradeoffs to consider. While higher-dimensional embeddings can lead to a more detailed representation of the words, the feature space also becomes sparser, which means high-dimensional embeddings are only appropriate for large corpuses. Furthermore, larger embeddings mean more parameters to learn, so increasing the embedding size can increase training time significantly. We are only training on a very small dataset, so we have opted to use embeddings of size **20**:
5. We then define the length of our embeddings. While this can technically be any number you wish, there are some tradeoffs to consider. While higher-dimensional embeddings can lead to a more detailed representation of the words, the feature space also becomes sparser, which means high-dimensional embeddings are only appropriate for large corpuses. Furthermore, larger embeddings mean more parameters to learn, so increasing the embedding size can increase training time significantly. We are only training on a very small dataset, so we have opted to use embeddings of size `20`:
6. We define our forward pass by obtaining and summing the embeddings for all input context words. This then passes through the fully connected layer with ReLU activation functions and finally into the classification layer, which predicts which word in the corpus corresponds to the summed embeddings of the context words the most:
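The forward pass described above can be sketched as a CBOW-style module. The class name, vocabulary size (50), and hidden-layer width (64) are assumptions for illustration; only the embedding size of 20 comes from the step above.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CBOW(nn.Module):
    def __init__(self, vocab_size, embedding_dim):
        super().__init__()
        self.embeddings = nn.Embedding(vocab_size, embedding_dim)
        self.linear = nn.Linear(embedding_dim, 64)
        self.output = nn.Linear(64, vocab_size)

    def forward(self, context):
        # Sum the embeddings of all input context words into one vector
        summed = self.embeddings(context).sum(dim=0).view(1, -1)
        # Fully connected layer with ReLU activation
        hidden = F.relu(self.linear(summed))
        # Classification layer predicting the most likely target word
        return F.log_softmax(self.output(hidden), dim=1)

model = CBOW(50, 20)
log_probs = model(torch.tensor([1, 4, 7, 9]))  # four context-word indices
```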
5. To define our output, we return a JSON response consisting of the output from our model and a response code, **200**, which is what is returned by our predict function:
5. To define our output, we return a JSON response consisting of the output from our model and a response code, `200`, which is what is returned by our predict function:
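A minimal Flask sketch of that return pattern follows; the route name and the stand-in prediction value are assumptions, not the chapter's actual endpoint.

```python
from flask import Flask, jsonify

app = Flask(__name__)

@app.route("/predict", methods=["GET"])
def predict():
    prediction = 0.87  # stand-in for the model's actual output
    # JSON body plus an explicit 200 response code
    return jsonify({"prediction": prediction}), 200

# Exercise the endpoint without starting a server
client = app.test_client()
response = client.get("/predict")
```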