introduction_en.ipynb 4.1 KB
Notebook
Newer Older
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146
{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "46995787",
   "metadata": {},
   "source": [
    "# GPT-2 Medium\n",
    "\n",
    "You can get more details from [GPT2 in PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP/blob/develop/model_zoo/gpt/README.md)."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f695ad73",
   "metadata": {},
   "source": [
    "## Model Details\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "5a8170d9",
   "metadata": {},
   "source": [
    "**Model Description:** GPT-2 Medium is the **355M parameter** version of GPT-2, a transformer-based language model created and released by OpenAI. The model is a pretrained model on English language using a causal language modeling (CLM) objective.\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "1d0dc244",
   "metadata": {},
   "source": [
    "- **Developed by:** OpenAI, see [associated research paper](https://d4mucfpksywv.cloudfront.net/better-language-models/language_models_are_unsupervised_multitask_learners.pdf) and [GitHub repo](https://github.com/openai/gpt-2) for model developers.\n",
    "- **Model Type:** Transformer-based language model\n",
    "- **Language(s):** English\n",
    "- **License:** [Modified MIT License](https://github.com/openai/gpt-2/blob/master/LICENSE)\n",
    "- **Related Models:** [GPT2](https://huggingface.co/gpt2), [GPT2-Large](https://huggingface.co/gpt2-large) and [GPT2-XL](https://huggingface.co/gpt2-xl)\n",
    "- **Resources for more information:**\n",
    "- [Research Paper](https://d4mucfpksywv.cloudfront.net/better-language-models/language_models_are_unsupervised_multitask_learners.pdf)\n",
    "- [OpenAI Blog Post](https://openai.com/blog/better-language-models/)\n",
    "- [GitHub Repo](https://github.com/openai/gpt-2)\n",
    "- [OpenAI Model Card for GPT-2](https://github.com/openai/gpt-2/blob/master/model_card.md)\n",
    "- Test the full generation capabilities here: https://transformer.huggingface.co/doc/gpt2-large\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "adc5a3f9",
   "metadata": {},
   "source": [
    "## How to Get Started with the Model\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "7566eafd",
   "metadata": {},
   "source": [
    "Use the code below to get started with the model. \n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "ab4c71ee",
   "metadata": {},
   "outputs": [],
   "source": [
    "!pip install --upgrade paddlenlp"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "b0167528",
   "metadata": {},
   "outputs": [],
   "source": [
    "import paddle\n",
    "from paddlenlp.transformers import AutoModel\n",
    "\n",
    "model = AutoModel.from_pretrained(\"gpt2-medium\")\n",
    "input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
    "print(model(input_ids))"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "52cdcf9e",
   "metadata": {},
   "source": [
    "## Citation\n",
    "\n",
    "```\n",
    "@article{radford2019language,\n",
    "title={Language models are unsupervised multitask learners},\n",
    "author={Radford, Alec and Wu, Jeffrey and Child, Rewon and Luan, David and Amodei, Dario and Sutskever, Ilya and others},\n",
    "journal={OpenAI blog},\n",
    "volume={1},\n",
    "number={8},\n",
    "pages={9},\n",
    "year={2019}\n",
    "}\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "eb327c10",
   "metadata": {},
   "source": [
    "## Model Card Authors\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "50fb7de8",
   "metadata": {},
   "source": [
    "> The model introduction and model weights originate from https://huggingface.co/gpt2-medium and were converted to PaddlePaddle format for ease of use in PaddleNLP."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.13"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}