{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "dc26013b",
   "metadata": {},
   "source": [
    "# GPT-2 Large\n",
    "\n",
    "You can get more details from [GPT2 in PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP/blob/develop/model_zoo/gpt/README.md)."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "38e29e37",
   "metadata": {},
   "source": [
    "## Table of Contents\n",
    "- [Model Details](#model-details)\n",
    "- [How To Get Started With the Model](#how-to-get-started-with-the-model)\n",
    "- [Uses](#uses)\n",
    "- [Risks, Limitations and Biases](#risks-limitations-and-biases)\n",
    "- [Training](#training)\n",
    "- [Evaluation](#evaluation)\n",
    "- [Environmental Impact](#environmental-impact)\n",
    "- [Technical Specifications](#technical-specifications)\n",
    "- [Citation Information](#citation-information)\n",
    "- [Model Card Authors](#model-card-author)\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "590c3fbd",
   "metadata": {},
   "source": [
    "## Model Details\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "1a2cd621",
   "metadata": {},
   "source": [
    "**Model Description:** GPT-2 Large is the **774M parameter** version of GPT-2, a transformer-based language model created and released by OpenAI. The model is a pretrained model on English language using a causal language modeling (CLM) objective.\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "0155f43f",
   "metadata": {},
   "source": [
    "- **Developed by:** OpenAI, see [associated research paper](https://d4mucfpksywv.cloudfront.net/better-language-models/language_models_are_unsupervised_multitask_learners.pdf) and [GitHub repo](https://github.com/openai/gpt-2) for model developers.\n",
    "- **Model Type:** Transformer-based language model\n",
    "- **Language(s):** English\n",
    "- **License:** [Modified MIT License](https://github.com/openai/gpt-2/blob/master/LICENSE)\n",
    "- **Related Models:** https://huggingface.co/gpt2, GPT-Medium and GPT-XL\n",
    "- **Resources for more information:**\n",
    "- [Research Paper](https://d4mucfpksywv.cloudfront.net/better-language-models/language_models_are_unsupervised_multitask_learners.pdf)\n",
    "- [OpenAI Blog Post](https://openai.com/blog/better-language-models/)\n",
    "- [GitHub Repo](https://github.com/openai/gpt-2)\n",
    "- [OpenAI Model Card for GPT-2](https://github.com/openai/gpt-2/blob/master/model_card.md)\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "18e2772d",
   "metadata": {},
   "source": [
    "## How to use"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "30207821",
   "metadata": {},
   "outputs": [],
   "source": [
    "!pip install --upgrade paddlenlp"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "2ae65fe6",
   "metadata": {},
   "outputs": [],
   "source": [
    "import paddle\n",
    "from paddlenlp.transformers import AutoModel\n",
    "\n",
    "model = AutoModel.from_pretrained(\"gpt2-large\")\n",
    "input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
    "print(model(input_ids))"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "e8b7c92b",
   "metadata": {},
   "source": [
    "## Citation\n",
    "\n",
    "```\n",
    "@article{radford2019language,\n",
    "title={Language models are unsupervised multitask learners},\n",
    "author={Radford, Alec and Wu, Jeffrey and Child, Rewon and Luan, David and Amodei, Dario and Sutskever, Ilya and others},\n",
    "journal={OpenAI blog},\n",
    "volume={1},\n",
    "number={8},\n",
    "pages={9},\n",
    "year={2019}\n",
    "}\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "7cded70d",
   "metadata": {},
   "source": [
    "## Model Card Authors\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ff9ab2d4",
   "metadata": {},
   "source": [
    "This model card was written by the Hugging Face team.\n",
    "\n",
    "> The model introduction and model weights originate from [https://huggingface.co/gpt2-large](https://huggingface.co/gpt2-large) and were converted to PaddlePaddle format for ease of use in PaddleNLP."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.13"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}