{ "cells": [ { "cell_type": "markdown", "id": "46995787", "metadata": {}, "source": [ "# GPT-2 Medium\n", "\n", "For more details, see [GPT2 in PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP/blob/develop/model_zoo/gpt/README.md)." ] }, { "cell_type": "markdown", "id": "f695ad73", "metadata": {}, "source": [ "## Model Details\n" ] }, { "cell_type": "markdown", "id": "5a8170d9", "metadata": {}, "source": [ "**Model Description:** GPT-2 Medium is the **355M parameter** version of GPT-2, a transformer-based language model created and released by OpenAI. The model was pretrained on English text using a causal language modeling (CLM) objective.\n" ] }, { "cell_type": "markdown", "id": "1d0dc244", "metadata": {}, "source": [ "- **Developed by:** OpenAI, see [associated research paper](https://d4mucfpksywv.cloudfront.net/better-language-models/language_models_are_unsupervised_multitask_learners.pdf) and [GitHub repo](https://github.com/openai/gpt-2) for model developers.\n", "- **Model Type:** Transformer-based language model\n", "- **Language(s):** English\n", "- **License:** [Modified MIT License](https://github.com/openai/gpt-2/blob/master/LICENSE)\n", "- **Related Models:** [GPT2](https://huggingface.co/gpt2), [GPT2-Large](https://huggingface.co/gpt2-large) and [GPT2-XL](https://huggingface.co/gpt2-xl)\n", "- **Resources for more information:**\n", "  - [Research Paper](https://d4mucfpksywv.cloudfront.net/better-language-models/language_models_are_unsupervised_multitask_learners.pdf)\n", "  - [OpenAI Blog Post](https://openai.com/blog/better-language-models/)\n", "  - [GitHub Repo](https://github.com/openai/gpt-2)\n", "  - [OpenAI Model Card for GPT-2](https://github.com/openai/gpt-2/blob/master/model_card.md)\n", "  - Test the full generation capabilities here: https://transformer.huggingface.co/doc/gpt2-large\n" ] }, { "cell_type": "markdown", "id": "adc5a3f9", "metadata": {}, "source": [ "## How to Get Started with the Model\n" ] }, { "cell_type": 
"markdown", "id": "7566eafd", "metadata": {}, "source": [ "Use the code below to get started with the model.\n" ] }, { "cell_type": "code", "execution_count": null, "id": "ab4c71ee", "metadata": {}, "outputs": [], "source": [ "!pip install --upgrade paddlenlp" ] }, { "cell_type": "code", "execution_count": null, "id": "b0167528", "metadata": {}, "outputs": [], "source": [ "import paddle\n", "from paddlenlp.transformers import AutoModel\n", "\n", "# Load the pretrained GPT-2 Medium weights\n", "model = AutoModel.from_pretrained(\"gpt2-medium\")\n", "# Dummy batch of random token IDs in [100, 200), shape [batch_size=1, seq_len=20]\n", "input_ids = paddle.randint(100, 200, shape=[1, 20])\n", "# Run a forward pass and print the model output\n", "print(model(input_ids))" ] }, { "cell_type": "markdown", "id": "52cdcf9e", "metadata": {}, "source": [ "## Citation\n", "\n", "```\n", "@article{radford2019language,\n", "title={Language models are unsupervised multitask learners},\n", "author={Radford, Alec and Wu, Jeffrey and Child, Rewon and Luan, David and Amodei, Dario and Sutskever, Ilya and others},\n", "journal={OpenAI blog},\n", "volume={1},\n", "number={8},\n", "pages={9},\n", "year={2019}\n", "}\n", "```" ] }, { "cell_type": "markdown", "id": "eb327c10", "metadata": {}, "source": [ "## Model Card Authors\n" ] }, { "cell_type": "markdown", "id": "50fb7de8", "metadata": {}, "source": [ "> The model introduction and model weights originate from https://huggingface.co/gpt2-medium and were converted to PaddlePaddle format for ease of use in PaddleNLP." ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.13" } }, "nbformat": 4, "nbformat_minor": 5 }