{ "cells": [ { "cell_type": "markdown", "id": "25324e9c", "metadata": {}, "source": [ "# GPT-2 Medium\n", "\n", "For details, see [GPT2 in PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP/blob/develop/model_zoo/gpt/README.md)." ] }, { "cell_type": "markdown", "id": "806177e3", "metadata": {}, "source": [ "## Model Details\n" ] }, { "cell_type": "markdown", "id": "dbcaecb0", "metadata": {}, "source": [ "**Model Description:** GPT-2 Medium is the **355M-parameter** version of GPT-2, a transformer-based language model created and released by OpenAI. It was pretrained on English text with a causal language modeling (CLM) objective.\n" ] }, { "cell_type": "markdown", "id": "ab73e9f0", "metadata": {}, "source": [ "- **Developed by:** OpenAI; see the [associated research paper](https://d4mucfpksywv.cloudfront.net/better-language-models/language_models_are_unsupervised_multitask_learners.pdf) and [GitHub repo](https://github.com/openai/gpt-2) for the model developers.\n", "- **Model Type:** Transformer-based language model\n", "- **Language(s):** English\n", "- **License:** [Modified MIT License](https://github.com/openai/gpt-2/blob/master/LICENSE)\n", "- **Related Models:** [GPT2](https://huggingface.co/gpt2), [GPT2-Large](https://huggingface.co/gpt2-large) and [GPT2-XL](https://huggingface.co/gpt2-xl)\n", "- **Resources for more information:**\n", "  - [Research Paper](https://d4mucfpksywv.cloudfront.net/better-language-models/language_models_are_unsupervised_multitask_learners.pdf)\n", "  - [OpenAI Blog Post](https://openai.com/blog/better-language-models/)\n", "  - [GitHub Repo](https://github.com/openai/gpt-2)\n", "  - [OpenAI Model Card for GPT-2](https://github.com/openai/gpt-2/blob/master/model_card.md)\n", "  - Test the full generation capabilities here: https://transformer.huggingface.co/doc/gpt2-large\n" ] }, { "cell_type": "markdown", "id": "70c3fd36", "metadata": {}, "source": [ "## How to use" ] }, { "cell_type": "code", "execution_count": null, "id": "1bae5ee0", 
"metadata": {}, "outputs": [], "source": [ "!pip install --upgrade paddlenlp" ] }, { "cell_type": "code", "execution_count": null, "id": "11b32577", "metadata": {}, "outputs": [], "source": [ "import paddle\n", "from paddlenlp.transformers import AutoModel\n", "\n", "model = AutoModel.from_pretrained(\"gpt2-medium\")\n", "input_ids = paddle.randint(100, 200, shape=[1, 20])\n", "print(model(input_ids))" ] }, { "cell_type": "markdown", "id": "08f90ea0", "metadata": {}, "source": [ "## Citation\n", "\n", "```\n", "@article{radford2019language,\n", "title={Language models are unsupervised multitask learners},\n", "author={Radford, Alec and Wu, Jeffrey and Child, Rewon and Luan, David and Amodei, Dario and Sutskever, Ilya and others},\n", "journal={OpenAI blog},\n", "volume={1},\n", "number={8},\n", "pages={9},\n", "year={2019}\n", "}\n", "```" ] }, { "cell_type": "markdown", "id": "64d79312", "metadata": {}, "source": [ "## Model Card Authors\n" ] }, { "cell_type": "markdown", "id": "d14dd2ac", "metadata": {}, "source": [ "This model card was written by the Hugging Face team.\n", "\n", "> 此模型介绍及权重来源于 https://huggingface.co/gpt2-medium ,并转换为飞桨模型格式。" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.13" } }, "nbformat": 4, "nbformat_minor": 5 }