{ "cells": [ { "cell_type": "markdown", "id": "32b8730a", "metadata": {}, "source": [ "# GPT-2 Large\n", "\n", "详细内容请看[GPT2 in PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP/blob/develop/model_zoo/gpt/README.md)。" ] }, { "cell_type": "markdown", "id": "de66cac3", "metadata": {}, "source": [ "## Table of Contents\n", "- [Model Details](#model-details)\n", "- [How To Get Started With the Model](#how-to-get-started-with-the-model)\n", "- [Uses](#uses)\n", "- [Risks, Limitations and Biases](#risks-limitations-and-biases)\n", "- [Training](#training)\n", "- [Evaluation](#evaluation)\n", "- [Environmental Impact](#environmental-impact)\n", "- [Technical Specifications](#technical-specifications)\n", "- [Citation Information](#citation-information)\n", "- [Model Card Authors](#model-card-author)\n" ] }, { "cell_type": "markdown", "id": "8afa58ef", "metadata": {}, "source": [ "## Model Details\n" ] }, { "cell_type": "markdown", "id": "e4e46496", "metadata": {}, "source": [ "**Model Description:** GPT-2 Large is the **774M parameter** version of GPT-2, a transformer-based language model created and released by OpenAI. 
The model is pretrained on English text using a causal language modeling (CLM) objective.\n" ] }, { "cell_type": "markdown", "id": "15b8f634", "metadata": {}, "source": [ "- **Developed by:** OpenAI, see the [associated research paper](https://d4mucfpksywv.cloudfront.net/better-language-models/language_models_are_unsupervised_multitask_learners.pdf) and [GitHub repo](https://github.com/openai/gpt-2) for model developers.\n", "- **Model Type:** Transformer-based language model\n", "- **Language(s):** English\n", "- **License:** [Modified MIT License](https://github.com/openai/gpt-2/blob/master/LICENSE)\n", "- **Related Models:** [GPT-2](https://huggingface.co/gpt2), [GPT-2 Medium](https://huggingface.co/gpt2-medium) and [GPT-2 XL](https://huggingface.co/gpt2-xl)\n", "- **Resources for more information:**\n", "  - [Research Paper](https://d4mucfpksywv.cloudfront.net/better-language-models/language_models_are_unsupervised_multitask_learners.pdf)\n", "  - [OpenAI Blog Post](https://openai.com/blog/better-language-models/)\n", "  - [GitHub Repo](https://github.com/openai/gpt-2)\n", "  - [OpenAI Model Card for GPT-2](https://github.com/openai/gpt-2/blob/master/model_card.md)\n", "  - Test the full generation capabilities here: https://transformer.huggingface.co/doc/gpt2-large\n" ] }, { "cell_type": "markdown", "id": "6c2023d9", "metadata": {}, "source": [ "## How To Get Started With the Model" ] }, { "cell_type": "code", "execution_count": null, "id": "b17e6efb", "metadata": {}, "outputs": [], "source": [ "!pip install --upgrade paddlenlp" ] }, { "cell_type": "code", "execution_count": null, "id": "33c1f565", "metadata": {}, "outputs": [], "source": [ "import paddle\n", "from paddlenlp.transformers import AutoModel\n", "\n", "# Load the pretrained GPT-2 Large weights\n", "model = AutoModel.from_pretrained(\"gpt2-large\")\n", "\n", "# Run a forward pass on a random batch of 20 token ids\n", "input_ids = paddle.randint(100, 200, shape=[1, 20])\n", "print(model(input_ids))" ] }, { "cell_type": "markdown", "id": "8060d283", "metadata": {}, "source": [ "## Citation Information\n", "\n", "```\n", 
"@article{radford2019language,\n", "title={Language models are unsupervised multitask learners},\n", "author={Radford, Alec and Wu, Jeffrey and Child, Rewon and Luan, David and Amodei, Dario and Sutskever, Ilya and others},\n", "journal={OpenAI blog},\n", "volume={1},\n", "number={8},\n", "pages={9},\n", "year={2019}\n", "}\n", "```" ] }, { "cell_type": "markdown", "id": "083f0d9c", "metadata": {}, "source": [ "## Model Card Authors\n" ] }, { "cell_type": "markdown", "id": "f9e4bb43", "metadata": {}, "source": [ "This model card was written by the Hugging Face team.\n", "\n", "> 此模型介绍及权重来源于[https://huggingface.co/gpt2-large](https://huggingface.co/gpt2-large),并转换为飞桨模型格式。\n" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.13" } }, "nbformat": 4, "nbformat_minor": 5 }