{ "cells": [ { "cell_type": "markdown", "id": "4bd898ca", "metadata": {}, "source": [ "# Model Card for DistilRoBERTa base\n" ] }, { "cell_type": "markdown", "id": "7d39a086", "metadata": {}, "source": [ "## Model Description\n" ] }, { "cell_type": "markdown", "id": "e2043d14", "metadata": {}, "source": [ "This model is a distilled version of the [RoBERTa-base model](https://huggingface.co/roberta-base). It follows the same training procedure as [DistilBERT](https://huggingface.co/distilbert-base-uncased).\n", "The code for the distillation process can be found [here](https://github.com/huggingface/transformers/tree/master/examples/distillation).\n", "This model is case-sensitive: it makes a difference between english and English.\n" ] }, { "cell_type": "markdown", "id": "10aefe84", "metadata": {}, "source": [ "The model has 6 layers, 768 dimension and 12 heads, totalizing 82M parameters (compared to 125M parameters for RoBERTa-base).\n", "On average DistilRoBERTa is twice as fast as Roberta-base.\n" ] }, { "cell_type": "markdown", "id": "d7ebd775", "metadata": {}, "source": [ "We encourage users of this model card to check out the [RoBERTa-base model card](https://huggingface.co/roberta-base) to learn more about usage, limitations and potential biases.\n" ] }, { "cell_type": "markdown", "id": "423d28b1", "metadata": {}, "source": [ "- **Developed by:** Victor Sanh, Lysandre Debut, Julien Chaumond, Thomas Wolf (Hugging Face)\n", "- **Model type:** Transformer-based language model\n", "- **Language(s) (NLP):** English\n", "- **License:** Apache 2.0\n", "- **Related Models:** [RoBERTa-base model card](https://huggingface.co/roberta-base)\n", "- **Resources for more information:**\n", "- [GitHub Repository](https://github.com/huggingface/transformers/blob/main/examples/research_projects/distillation/README.md)\n", "- [Associated Paper](https://arxiv.org/abs/1910.01108)\n" ] }, { "cell_type": "markdown", "id": "715b4360", "metadata": {}, "source": [ "## How to use" ] }, { "cell_type": "code", "execution_count": null, "id": "9ad9b1a9", "metadata": {}, "outputs": [], "source": [ "!pip install --upgrade paddlenlp" ] }, { "cell_type": "code", "execution_count": null, "id": "94e4d093", "metadata": {}, "outputs": [], "source": [ "import paddle\n", "from paddlenlp.transformers import AutoModel\n", "\n", "model = AutoModel.from_pretrained(\"distilroberta-base\")\n", "input_ids = paddle.randint(100, 200, shape=[1, 20])\n", "print(model(input_ids))" ] }, { "cell_type": "markdown", "id": "e258a20c", "metadata": {}, "source": [ "\n", "\n", "\n", "\n", "> The model introduction and model weights originate from [https://huggingface.co/distilroberta-base](https://huggingface.co/distilroberta-base) and were converted to PaddlePaddle format for ease of use in PaddleNLP.\n" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.13" } }, "nbformat": 4, "nbformat_minor": 5 }