{ "cells": [ { "cell_type": "markdown", "id": "85c2e1a7", "metadata": {}, "source": [ "# GerPT2\n" ] }, { "cell_type": "markdown", "id": "595fe7cb", "metadata": {}, "source": [ "See the GPT2 model card for considerations on limitations and bias. See the GPT2 documentation for details on GPT2.\n" ] }, { "cell_type": "markdown", "id": "5b4f950b", "metadata": {}, "source": [ "## Comparison to dbmdz/german-gpt2\n" ] }, { "cell_type": "markdown", "id": "95be6eb8", "metadata": {}, "source": [ "I evaluated both GerPT2-large and the other German GPT2, dbmdz/german-gpt2, on the [CC-100](http://data.statmt.org/cc-100/) dataset and on the German Wikipedia. The table reports perplexity (PPL); lower is better:\n" ] }, { "cell_type": "markdown", "id": "8acd14be", "metadata": {}, "source": [ "| | CC-100 (PPL) | Wikipedia (PPL) |\n", "|-------------------|--------------|-----------------|\n", "| dbmdz/german-gpt2 | 49.47 | 62.92 |\n", "| GerPT2 | 24.78 | 35.33 |\n", "| GerPT2-large | __16.08__ | __23.26__ |\n" ] }, { "cell_type": "markdown", "id": "6fa10d79", "metadata": {}, "source": [ "See the script `evaluate.py` in the [GerPT2 GitHub repository](https://github.com/bminixhofer/gerpt2) for the code.\n" ] }, { "cell_type": "markdown", "id": "a8514e1e", "metadata": {}, "source": [ "## Usage\n" ] }, { "cell_type": "code", "execution_count": null, "id": "4bc62c63", "metadata": {}, "outputs": [], "source": [ "!pip install --upgrade paddlenlp" ] }, { "cell_type": "code", "execution_count": null, "id": "63f78302", "metadata": {}, "outputs": [], "source": [ "import paddle\n", "from paddlenlp.transformers import AutoModel\n", "\n", "# Load the GerPT2-large weights converted to PaddlePaddle format\n", "model = AutoModel.from_pretrained(\"benjamin/gerpt2-large\")\n", "# Random token IDs in [100, 200) stand in for tokenized text (batch of 1, 20 tokens)\n", "input_ids = paddle.randint(100, 200, shape=[1, 20])\n", "# Run a forward pass and print the model output\n", "print(model(input_ids))" ] }, { "cell_type": "markdown", "id": "563152f3", "metadata": {}, "source": [ "```\n", "@misc{Minixhofer_GerPT2_German_large_2020,\n", "author = {Minixhofer, Benjamin},\n", "doi = {10.5281/zenodo.5509984},\n", "month = {12},\n", "title = {{GerPT2: German large and small versions of GPT2}},\n", "url = {https://github.com/bminixhofer/gerpt2},\n", "year = {2020}\n", "}\n", "```" ] }, { "cell_type": "markdown", "id": "b0d67d21", "metadata": {}, "source": [ "## Acknowledgements\n" ] }, { "cell_type": "markdown", "id": "474c1c61", "metadata": {}, "source": [ "Thanks to [Hugging Face](https://huggingface.co) for awesome tools and infrastructure.\n", "Huge thanks to [Artus Krohn-Grimberghe](https://twitter.com/artuskg) at [LYTiQ](https://www.lytiq.de/) for making this possible by sponsoring the resources used for training.\n", "\n", "> The model introduction and model weights originate from [https://huggingface.co/benjamin/gerpt2-large](https://huggingface.co/benjamin/gerpt2-large) and were converted to PaddlePaddle format for ease of use in PaddleNLP.\n" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.13" } }, "nbformat": 4, "nbformat_minor": 5 }