{ "cells": [ { "cell_type": "markdown", "id": "dc26013b", "metadata": {}, "source": [ "# GPT-2 Large\n", "\n", "For more details, see [GPT-2 in PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP/blob/develop/model_zoo/gpt/README.md)." ] }, { "cell_type": "markdown", "id": "38e29e37", "metadata": {}, "source": [ "## Table of Contents\n", "- [Model Details](#model-details)\n", "- [How to use](#how-to-use)\n", "- [Citation](#citation)\n", "- [Model Card Authors](#model-card-authors)\n" ] }, { "cell_type": "markdown", "id": "590c3fbd", "metadata": {}, "source": [ "## Model Details\n" ] }, { "cell_type": "markdown", "id": "1a2cd621", "metadata": {}, "source": [ "**Model Description:** GPT-2 Large is the **774M parameter** version of GPT-2, a transformer-based language model created and released by OpenAI. 
The model is pretrained on English text using a causal language modeling (CLM) objective.\n" ] }, { "cell_type": "markdown", "id": "0155f43f", "metadata": {}, "source": [ "- **Developed by:** OpenAI; see the [associated research paper](https://d4mucfpksywv.cloudfront.net/better-language-models/language_models_are_unsupervised_multitask_learners.pdf) and [GitHub repo](https://github.com/openai/gpt-2) for the model developers.\n", "- **Model Type:** Transformer-based language model\n", "- **Language(s):** English\n", "- **License:** [Modified MIT License](https://github.com/openai/gpt-2/blob/master/LICENSE)\n", "- **Related Models:** [GPT-2](https://huggingface.co/gpt2), GPT-2 Medium and GPT-2 XL\n", "- **Resources for more information:**\n", "  - [Research Paper](https://d4mucfpksywv.cloudfront.net/better-language-models/language_models_are_unsupervised_multitask_learners.pdf)\n", "  - [OpenAI Blog Post](https://openai.com/blog/better-language-models/)\n", "  - [GitHub Repo](https://github.com/openai/gpt-2)\n", "  - [OpenAI Model Card for GPT-2](https://github.com/openai/gpt-2/blob/master/model_card.md)\n" ] }, { "cell_type": "markdown", "id": "18e2772d", "metadata": {}, "source": [ "## How to use" ] }, { "cell_type": "code", "execution_count": null, "id": "30207821", "metadata": {}, "outputs": [], "source": [ "!pip install --upgrade paddlenlp" ] }, { "cell_type": "code", "execution_count": null, "id": "2ae65fe6", "metadata": {}, "outputs": [], "source": [ "import paddle\n", "from paddlenlp.transformers import AutoModel\n", "\n", "# Load the pretrained GPT-2 Large weights by name\n", "model = AutoModel.from_pretrained(\"gpt2-large\")\n", "# Dummy batch: one sequence of 20 random token ids in [100, 200)\n", "input_ids = paddle.randint(100, 200, shape=[1, 20])\n", "print(model(input_ids))" ] }, { "cell_type": "markdown", "id": "e8b7c92b", "metadata": {}, "source": [ "## Citation\n", "\n", "```\n", "@article{radford2019language,\n", "title={Language models are unsupervised multitask learners},\n", "author={Radford, Alec and Wu, Jeffrey and Child, Rewon and Luan, David and Amodei, Dario and Sutskever, Ilya 
and others},\n", "journal={OpenAI blog},\n", "volume={1},\n", "number={8},\n", "pages={9},\n", "year={2019}\n", "}\n", "```" ] }, { "cell_type": "markdown", "id": "7cded70d", "metadata": {}, "source": [ "## Model Card Authors\n" ] }, { "cell_type": "markdown", "id": "ff9ab2d4", "metadata": {}, "source": [ "This model card was written by the Hugging Face team.\n", "\n", "> The model introduction and model weights originate from [https://huggingface.co/gpt2-large](https://huggingface.co/gpt2-large) and were converted to PaddlePaddle format for ease of use in PaddleNLP." ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.13" } }, "nbformat": 4, "nbformat_minor": 5 }