{ "cells": [ { "cell_type": "markdown", "id": "4260a150", "metadata": {}, "source": [ "# Model Card for DistilBERT base multilingual (cased)\n", "\n", "You can get more details from [Bert in PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP/blob/develop/model_zoo/bert/README.md)。" ] }, { "cell_type": "markdown", "id": "53f1b1c2", "metadata": {}, "source": [ "This model is a distilled version of the [BERT base multilingual model](https://huggingface.co/bert-base-multilingual-cased/). The code for the distillation process can be found [here](https://github.com/huggingface/transformers/tree/main/examples/research_projects/distillation). This model is cased: it does make a difference between english and English.\n" ] }, { "cell_type": "markdown", "id": "f417583b", "metadata": {}, "source": [ "The model is trained on the concatenation of Wikipedia in 104 different languages listed [here](https://github.com/google-research/bert/blob/master/multilingual.md#list-of-languages).\n", "The model has 6 layers, 768 dimension and 12 heads, totalizing 134M parameters (compared to 177M parameters for mBERT-base).\n", "On average, this model, referred to as DistilmBERT, is twice as fast as mBERT-base.\n", "\n", "- **Developed by:** Victor Sanh, Lysandre Debut, Julien Chaumond, Thomas Wolf (Hugging Face)\n", "- **Model type:** Transformer-based language model\n", "- **Language(s) (NLP):** 104 languages; see full list [here](https://github.com/google-research/bert/blob/master/multilingual.md#list-of-languages)\n", "- **License:** Apache 2.0\n", "- **Related Models:** [BERT base multilingual model](https://huggingface.co/bert-base-multilingual-cased)\n", "- **Resources for more information:**\n", "- [GitHub Repository](https://github.com/huggingface/transformers/blob/main/examples/research_projects/distillation/README.md)\n", "- [Associated Paper](https://arxiv.org/abs/1910.01108)\n" ] }, { "cell_type": "markdown", "id": "f47ce9b7", "metadata": {}, "source": [ "## How to use" ] }, { "cell_type": "code", "execution_count": null, "id": "a1353b5f", "metadata": {}, "outputs": [], "source": [ "!pip install --upgrade paddlenlp" ] }, { "cell_type": "code", "execution_count": null, "id": "e23a860f", "metadata": {}, "outputs": [], "source": [ "import paddle\n", "from paddlenlp.transformers import AutoModel\n", "\n", "model = AutoModel.from_pretrained(\"distilbert-base-multilingual-cased\")\n", "input_ids = paddle.randint(100, 200, shape=[1, 20])\n", "print(model(input_ids))" ] }, { "cell_type": "markdown", "id": "38c30ea4", "metadata": {}, "source": [ "# Citation\n", "\n", "```\n", "@article{Sanh2019DistilBERTAD,\n", " title={DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter},\n", " author={Victor Sanh and Lysandre Debut and Julien Chaumond and Thomas Wolf},\n", " journal={ArXiv},\n", " year={2019},\n", " volume={abs/1910.01108}\n", "}\n", "```\n" ] }, { "cell_type": "markdown", "id": "0ee03d6a", "metadata": {}, "source": [ "> The model introduction and model weights originate from https://huggingface.co/distilbert-base-multilingual-cased and were converted to PaddlePaddle format for ease of use in PaddleNLP.\n" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.13" } }, "nbformat": 4, 
"nbformat_minor": 5 }