{ "cells": [ { "cell_type": "markdown", "id": "e42aa4df", "metadata": {}, "source": [ "# GerPT2\n" ] }, { "cell_type": "markdown", "id": "08fd6403", "metadata": {}, "source": [ "See the GPT2 model card for considerations on limitations and bias. See the GPT2 documentation for details on GPT2.\n" ] }, { "cell_type": "markdown", "id": "8295e28d", "metadata": {}, "source": [ "## Comparison to dbmdz/german-gpt2\n" ] }, { "cell_type": "markdown", "id": "c0f50f67", "metadata": {}, "source": [ "I evaluated both GerPT2-large and the other German GPT2, dbmdz/german-gpt2, on the [CC-100](http://data.statmt.org/cc-100/) dataset and on the German Wikipedia, measuring perplexity (PPL, lower is better):\n" ] }, { "cell_type": "markdown", "id": "6ecdc149", "metadata": {}, "source": [ "| Model | CC-100 (PPL) | Wikipedia (PPL) |\n", "|-------------------|--------------|-----------------|\n", "| dbmdz/german-gpt2 | 49.47 | 62.92 |\n", "| GerPT2 | 24.78 | 35.33 |\n", "| GerPT2-large | __16.08__ | __23.26__ |\n" ] }, { "cell_type": "markdown", "id": "3cddd6a8", "metadata": {}, "source": [ "See the script `evaluate.py` in the [GerPT2 GitHub repository](https://github.com/bminixhofer/gerpt2) for the code.\n" ] }, { "cell_type": "markdown", "id": "d838da15", "metadata": {}, "source": [ "## Usage\n" ] }, { "cell_type": "code", "execution_count": null, "id": "476bf523", "metadata": {}, "outputs": [], "source": [ "!pip install --upgrade paddlenlp" ] }, { "cell_type": "code", "execution_count": null, "id": "8f509fec", "metadata": {}, "outputs": [], "source": [ "import paddle\n", "from paddlenlp.transformers import AutoModel\n", "\n", "# Load the converted GerPT2-large weights\n", "model = AutoModel.from_pretrained(\"benjamin/gerpt2-large\")\n", "# Random token IDs as a dummy input (batch of 1, 20 tokens)\n", "input_ids = paddle.randint(100, 200, shape=[1, 20])\n", "# Print the model's raw outputs for the dummy input\n", "print(model(input_ids))" ] }, { "cell_type": "markdown", "id": "d135a538", "metadata": {}, "source": [ "```\n", "@misc{Minixhofer_GerPT2_German_large_2020,\n", "author = {Minixhofer, Benjamin},\n", "doi = {10.5281/zenodo.5509984},\n", "month = {12},\n", "title = {{GerPT2: German 
large and small versions of GPT2}},\n", "url = {https://github.com/bminixhofer/gerpt2},\n", "year = {2020}\n", "}\n", "```" ] }, { "cell_type": "markdown", "id": "63e09ad7", "metadata": {}, "source": [ "## Acknowledgements\n" ] }, { "cell_type": "markdown", "id": "d9dc51e1", "metadata": {}, "source": [ "Thanks to [Hugging Face](https://huggingface.co) for awesome tools and infrastructure.\n", "Huge thanks to [Artus Krohn-Grimberghe](https://twitter.com/artuskg) at [LYTiQ](https://www.lytiq.de/) for making this possible by sponsoring the resources used for training.\n", "\n", "> This model description and the weights come from [https://huggingface.co/benjamin/gerpt2-large](https://huggingface.co/benjamin/gerpt2-large) and were converted to the PaddlePaddle model format.\n" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.13" } }, "nbformat": 4, "nbformat_minor": 5 }