2. **Pooling layers**: Although convolutional layers are capable of extracting relevant features from images, the feature maps they produce can become enormous when analyzing complex geometrical shapes, which would make training computationally infeasible. Pooling layers were invented to shrink these feature maps while preserving the most relevant information.
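As a quick, hedged illustration (the shapes here are arbitrary, not taken from any particular activity), a max pooling layer in PyTorch halves each spatial dimension of a feature map, shrinking the number of values the subsequent layers must process:

```python
import torch
import torch.nn as nn

conv = nn.Conv2d(in_channels=3, out_channels=16, kernel_size=3, padding=1)
pool = nn.MaxPool2d(kernel_size=2, stride=2)  # keeps the max of every 2x2 patch

images = torch.randn(1, 3, 32, 32)   # one random 32x32 RGB image
features = conv(images)              # shape: (1, 16, 32, 32)
pooled = pool(features)              # shape: (1, 16, 16, 16) -- 4x fewer values
print(features.shape, pooled.shape)
```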
3. **Noisy data/outliers**: Noisy data refers to values that are visibly incorrect, for instance, a person who is 200 years old. On the other hand, outliers refer to values that, although they may be correct, are very far from the mean, for instance, a 10-year-old college student.
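As a minimal sketch (the age values and the thresholds are invented for illustration), the two cases can be flagged separately, using domain knowledge for noise and a simple standard-deviation rule for outliers:

```python
import pandas as pd

ages = pd.Series([21, 22, 23, 24, 22, 23, 21, 10, 200])

# Noisy values: visibly impossible given domain knowledge
noisy = ages[(ages < 0) | (ages > 120)]

# Outliers: plausible values far from the mean, checked after dropping
# the noise (threshold of 2 standard deviations chosen for illustration)
clean = ages.drop(noisy.index)
outliers = clean[(clean - clean.mean()).abs() > 2 * clean.std()]

print("Noisy:", noisy.tolist())        # [200]
print("Outliers:", outliers.tolist())  # [10]
```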
8. Train the network using the training set's data, and use the validation set to measure performance. To do this, save the loss and the accuracy of both the training and validation sets in each epoch, as in the sketch below.
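A minimal, self-contained sketch of such a loop follows; the dummy model, loaders, and settings stand in for the ones built in the activity:

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

# Dummy stand-ins for the activity's model, data, and training settings
model = nn.Linear(10, 2)
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)
loss_function = nn.CrossEntropyLoss()
train_loader = DataLoader(TensorDataset(torch.randn(64, 10),
                                        torch.randint(0, 2, (64,))), batch_size=16)
val_loader = DataLoader(TensorDataset(torch.randn(32, 10),
                                      torch.randint(0, 2, (32,))), batch_size=16)

history = {"train_loss": [], "train_acc": [], "val_loss": [], "val_acc": []}
for epoch in range(5):
    for training, loader in [(True, train_loader), (False, val_loader)]:
        model.train() if training else model.eval()
        total_loss, correct, count = 0.0, 0, 0
        with torch.set_grad_enabled(training):
            for X, y in loader:
                preds = model(X)
                loss = loss_function(preds, y)
                if training:
                    optimizer.zero_grad()
                    loss.backward()
                    optimizer.step()
                total_loss += loss.item() * y.size(0)
                correct += (preds.argmax(dim=1) == y).sum().item()
                count += y.size(0)
        prefix = "train" if training else "val"
        history[prefix + "_loss"].append(total_loss / count)
        history[prefix + "_acc"].append(correct / count)
```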
2. Change the definition of the **transform** variable so that it includes, in addition to normalizing and converting the data into tensors, the following transformations:
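A sketch of one possible redefinition, assuming (purely for illustration) that random horizontal flips and random grayscale conversion are the transformations being added:

```python
from torchvision import transforms

transform = transforms.Compose([
    transforms.RandomHorizontalFlip(p=0.5),  # assumed augmentation
    transforms.RandomGrayscale(p=0.1),       # assumed augmentation
    transforms.ToTensor(),                   # converts the data into tensors
    transforms.Normalize((0.5, 0.5, 0.5),    # normalizes each channel
                         (0.5, 0.5, 0.5)),
])
```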
2. Next, we build our word index, which is simply a dictionary that maps each word in our corpus to a unique index value. This can be easily done with a short **for** loop:
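A minimal sketch of that loop, with a tiny hypothetical **training_data** standing in for the corpus built earlier:

```python
# Hypothetical corpus: (tokenized sentence, label) pairs
training_data = [("me gusta comer".split(), "Spanish"),
                 ("I like to eat".split(), "English")]

word_dict = {}
for sentence, _ in training_data:
    for word in sentence:
        if word not in word_dict:
            word_dict[word] = len(word_dict)  # assign the next unused index
print(word_dict)
```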
4. Next, we define some utility functions. We first define **make_bow_vector**, which takes a sentence and transforms it into a bag-of-words representation. We first create a vector of all zeros. We then loop through the words in the sentence and, for each one, increment the count at that word's index within the bag-of-words vector by one. We finally reshape this vector using **.view()** for entry into our classifier:
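A sketch of the function under the same assumptions as above (a **word_dict** mapping words to indices and tokenized sentences):

```python
import torch

def make_bow_vector(sentence, word_dict):
    vec = torch.zeros(len(word_dict))  # one zeroed slot per vocabulary word
    for word in sentence:
        vec[word_dict[word]] += 1      # count each occurrence of the word
    return vec.view(1, -1)             # reshape to a 1 x vocabulary_size row
```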
2. We then simply print the sentence, the true label of the sentence, and the predicted probabilities. Note that we transform the predicted values from log probabilities back into probabilities. We obtain two probabilities for each prediction; referring back to the label index, the first probability (index 0) corresponds to Spanish, whereas the second (index 1) corresponds to English:
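A sketch of this step, where **model**, **sentence**, and **label** are placeholders for the trained classifier and a test pair from the activity:

```python
import torch

with torch.no_grad():
    bow_vec = make_bow_vector(sentence, word_dict)
    log_probs = model(bow_vec)
    probs = torch.exp(log_probs)  # log probabilities -> probabilities
    print(sentence, label)
    print(probs)                  # index 0 = Spanish, index 1 = English
```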
3. Here, we define a function that takes a word as input and returns that word's weights within the layer. For a given word, we look up its index in our dictionary and then select the parameters at that same index within the model. Note that the model holds two weights for each word because we are making two predictions; that is, the word's contribution to the Spanish prediction and its contribution to the English prediction:
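A sketch of such a function, assuming the classifier's single linear layer stores one weight column per word and one row per class:

```python
def return_params(word):
    index = word_dict[word]          # the word's position in the vocabulary
    for param in model.parameters():
        if param.dim() == 2:         # the linear layer's weight matrix
            print(word, "-> Spanish weight:", param[0][index].item())
            print(word, "-> English weight:", param[1][index].item())
```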