1. We start by loading our **spacy** tokenizers into Python. We need to do this once for each language, since we will be building two entirely separate vocabularies for this task:
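A minimal sketch of this step, assuming an English-to-German translation task and the small spaCy pipelines (`en_core_web_sm` and `de_core_news_sm` are assumed model names here; they need to be downloaded before they can be loaded):

```
import spacy

# Assumed pipeline names; install them first with, for example:
#   python -m spacy download en_core_web_sm
#   python -m spacy download de_core_news_sm
spacy_english = spacy.load("en_core_web_sm")
spacy_german = spacy.load("de_core_news_sm")
```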
2. Next, we create a function for each of our languages to tokenize our sentences. Note that the tokenizer for our input English sentences reverses the order of the tokens; reversing the source sequence is a trick from the original sequence-to-sequence paper, where it was found to improve translation quality by introducing short-term dependencies between the beginnings of the source and target sentences:
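A sketch of what these two functions might look like, assuming the `spacy_english` and `spacy_german` objects loaded above; lowercasing the tokens is an extra assumption, not something stated in the step:

```
def tokenize_english(text):
    # Tokenize the source sentence and reverse the token order
    return [token.text.lower() for token in spacy_english.tokenizer(text)][::-1]

def tokenize_german(text):
    # Tokenize the target sentence in its natural order
    return [token.text.lower() for token in spacy_german.tokenizer(text)]
```

For example, `tokenize_english("good morning")` returns `['morning', 'good']`, while the German tokenizer leaves the order intact.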
6. This output consists of a vector of the target vocabulary's length, containing a prediction score for each word in the vocabulary. We apply the **argmax** function to identify the actual word that is predicted by the model.
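A toy illustration of this step, using hypothetical names: `output` stands in for a single step of the model's prediction, and `target_vocab_itos` for a torchtext-style index-to-string lookup:

```
import torch

# One hypothetical prediction step: a vector with one score per word
# in the target vocabulary (here, a toy vocabulary of size 5)
output = torch.tensor([0.1, 0.2, 3.5, 0.4, 0.1])

# argmax returns the index of the highest-scoring entry
predicted_index = output.argmax().item()

# Mapping the index back to a word assumes an index-to-string list,
# as in torchtext-style vocabularies
target_vocab_itos = ["<unk>", "<pad>", "hallo", "welt", "."]
print(target_vocab_itos[predicted_index])  # prints "hallo"
```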