{ "cells": [ { "cell_type": "markdown", "id": "faeb5f50", "metadata": {}, "source": [ "## Chinese BERT with Whole Word Masking\n", "For further accelerating Chinese natural language processing, we provide **Chinese pre-trained BERT with Whole Word Masking**.\n", "\n", "**[Pre-Training with Whole Word Masking for Chinese BERT](https://arxiv.org/abs/1906.08101)**\n", "Yiming Cui, Wanxiang Che, Ting Liu, Bing Qin, Ziqing Yang, Shijin Wang, Guoping Hu\n", "\n", "This repository is developed based on:https://github.com/google-research/bert\n", "\n", "You may also interested in,\n", "- Chinese BERT series: https://github.com/ymcui/Chinese-BERT-wwm\n", "- Chinese MacBERT: https://github.com/ymcui/MacBERT\n", "- Chinese ELECTRA: https://github.com/ymcui/Chinese-ELECTRA\n", "- Chinese XLNet: https://github.com/ymcui/Chinese-XLNet\n", "- Knowledge Distillation Toolkit - TextBrewer: https://github.com/airaria/TextBrewer\n", "\n", "More resources by HFL: https://github.com/ymcui/HFL-Anthology\n" ] }, { "cell_type": "markdown", "id": "fbf98c0e", "metadata": {}, "source": [ "## How to Use" ] }, { "cell_type": "code", "execution_count": null, "id": "5f6b3ac7", "metadata": {}, "outputs": [], "source": [ "!pip install --upgrade paddlenlp" ] }, { "cell_type": "code", "execution_count": null, "id": "f380cab7", "metadata": {}, "outputs": [], "source": [ "import paddle\n", "from paddlenlp.transformers import AutoModel\n", "\n", "model = AutoModel.from_pretrained(\"hfl/chinese-bert-wwm-ext\")\n", "input_ids = paddle.randint(100, 200, shape=[1, 20])\n", "print(model(input_ids))" ] }, { "cell_type": "markdown", "id": "a39bca7c", "metadata": {}, "source": [ "\n", "## Citation\n", "If you find the technical report or resource is useful, please cite the following technical report in your paper.\n", "- Primary: https://arxiv.org/abs/2004.13922" ] }, { "cell_type": "markdown", "id": "5cff4b49", "metadata": {}, "source": [ "```\n", "@inproceedings{cui-etal-2020-revisiting,\n", "title = \"Revisiting Pre-Trained Models for {C}hinese Natural Language Processing\",\n", "author = \"Cui, Yiming and\n", "Che, Wanxiang and\n", "Liu, Ting and\n", "Qin, Bing and\n", "Wang, Shijin and\n", "Hu, Guoping\",\n", "booktitle = \"Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: Findings\",\n", "month = nov,\n", "year = \"2020\",\n", "address = \"Online\",\n", "publisher = \"Association for Computational Linguistics\",\n", "url = \"https://www.aclweb.org/anthology/2020.findings-emnlp.58\",\n", "pages = \"657--668\",\n", "}\n", "```" ] }, { "cell_type": "markdown", "id": "a8781cbe", "metadata": {}, "source": [ "- Secondary: https://arxiv.org/abs/1906.08101\n" ] }, { "cell_type": "markdown", "id": "b7acc10f", "metadata": {}, "source": [ "```\n", "@article{chinese-bert-wwm,\n", "title={Pre-Training with Whole Word Masking for Chinese BERT},\n", "author={Cui, Yiming and Che, Wanxiang and Liu, Ting and Qin, Bing and Yang, Ziqing and Wang, Shijin and Hu, Guoping},\n", "journal={arXiv preprint arXiv:1906.08101},\n", "year={2019}\n", "}\n", "```" ] }, { "cell_type": "markdown", "id": "86de1995", "metadata": {}, "source": [ "> The model introduction and model weights originate from [https://huggingface.co/hfl/chinese-bert-wwm-ext](https://huggingface.co/hfl/chinese-bert-wwm-ext) and were converted to PaddlePaddle format for ease of use in PaddleNLP." 
] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.13" }, "vscode": { "interpreter": { "hash": "606ea184b8fed3419d714b545dc1784fad6c99d0cc940b6b9d787dccf225faa5" } } }, "nbformat": 4, "nbformat_minor": 5 }