{
 "cells": [
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Introduction to Intent Classification\n",
    "\n",
    "*Copyright (c) 2023 Institute for Quantum Computing, Baidu Inc. All Rights Reserved.*\n",
    "\n",
    "Natural language processing (NLP) is a machine learning technology that gives computers the ability to interpret, manipulate, and comprehend human language. Organizations today have large volumes of voice and text data from various communication channels like emails, text messages, social media newsfeeds, video, audio, and more. They use NLP software to automatically process this data, analyze the intent or sentiment in the message, and respond in real time to human communication.\n",
    "\n",
    "Intent classification is one of the fundamental tasks in NLP and has important applications in products such as search engines, intelligent customer service, and robotics.\n",
    "\n",
    "Here, we use BERT-QTC [1], a quantum-classical hybrid model, to implement the intent recognition task. Here, our intention recognition task is to determine, for the input text, the intention corresponding to this sentence, such as chatting, asking for a recipe, asking for a TV channel, etc.\n",
    "\n",
    "We used the [SMP2017 dataset](https://github.com/HITlilingzhi/SMP2017ECDT-DATA) [2] for our experiment. We select four of the classes for training, which are train, music, weather, message, telephone, flight, and news. A sample of the data is as follows:\n",
    "\n",
    "- 查宁波到北京的火车票\n",
    "- 我想知道浙江义乌的天气\n",
    "- 帮我查一下明天广州到长沙的航班\n",
    "- 我想听最新军事新闻。\n"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Intent Classification Using BERT-QTC Model\n",
    "\n",
    "The BERT-QTC model is a quantum-classical hybrid model. It has the following model structure:\n",
    "\n",
    "![the arch of the bert-qtc model](bert_qtc_arch.png)\n",
    "\n",
    "The workflow of the model is as follows:\n",
    "\n",
    "1. Use BERT [3] to extract the feature of the input text to obtain a sentence-level feature representation.\n",
    "2. For the features extracted by BERT, use Quantum Temporal Convolution (QTC) and Global Maxing Pooling (GMP) for further feature extraction and dimensionality reduction.\n",
    "3. Use the fully connected layer to perform classification prediction and obtain classification results.\n",
    "\n",
    "### Workflow\n",
    "\n",
    "BERT-QTC is a learning model. Thus, We need to use the dataset to train the model first. After the training converges, we get a trained model that can classify this type of data. Among them, because the BERT model is a large language model. Therefore, we use pre-trained models for feature extraction. In the subsequent model training process, the parameters of the BERT part are no longer trained.\n",
    "\n",
    "In summary, its workflow is as follows:\n",
    "\n",
    "1. Prepare a dataset for intent classification.\n",
    "2. Use the dataset to train the BERT-QTC model to get the trained model.\n",
    "3. Use the model to predict the input text and get the prediction result."
   ]
  },
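  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To make the data flow above concrete, the following cell gives a minimal classical sketch of the decoder head (temporal convolution, global max pooling, fully connected layer). It only illustrates the tensor shapes under the example configuration used later in this notebook (hidden size 768, one filter, kernel size 5, padding 2, seven classes): the quantum temporal convolution is replaced by an ordinary `Conv1D` stand-in, so this is not the actual `paddle_quantum` implementation."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import paddle\n",
    "import paddle.nn as nn\n",
    "\n",
    "hidden_size, num_filter, kernel_size, num_classes = 768, 1, 5, 7\n",
    "seq_len = 32  # illustrative sequence length\n",
    "\n",
    "# Sentence-level features from BERT, arranged as [batch, channels, length].\n",
    "features = paddle.randn([1, hidden_size, seq_len])\n",
    "\n",
    "# Classical stand-in for the quantum temporal convolution (QTC).\n",
    "conv = nn.Conv1D(hidden_size, num_filter, kernel_size, padding=2)\n",
    "fc = nn.Linear(num_filter, num_classes)\n",
    "\n",
    "x = conv(features)          # temporal convolution over the sequence\n",
    "x = paddle.max(x, axis=-1)  # global max pooling (GMP)\n",
    "logits = fc(x)              # fully connected classification head\n",
    "print(logits.shape)         # [1, 7]"
   ]
  },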
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## How To Use\n",
    "\n",
    "Here, we give a trained model for testing. You only need to make corresponding configurations in the `example.toml` configuration file. Then enter `python intent_classification.py --config example.toml` to test the entered text.\n",
    "\n",
    "### Online Demo\n",
    "\n",
    "Here, we give a version of the online demo. First define the contents of the configuration file. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "test_toml = r\"\"\"\n",
    "task = 'test'\n",
    "text = '查宁波到北京的火车票'\n",
    "num_filter = 1\n",
    "kernel_size = 5\n",
    "circuit_depth = 2\n",
    "padding = 2\n",
    "model_path = 'decoder.pdparams'\n",
    "bert_model = 'bert-base-chinese'\n",
    "hidden_size = 768\n",
    "classes = ['火车', '音乐', '天气', '短信', '电话', '航班', '新闻']\n",
    "\"\"\""
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Next is the code for the prediction section."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "The input text is 查宁波到北京的火车票.\n",
      "The prediction of the model is 火车.\n"
     ]
    }
   ],
   "source": [
    "import os\n",
    "import warnings\n",
    "\n",
    "warnings.filterwarnings('ignore')\n",
    "os.environ['PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION'] = 'python'\n",
    "\n",
    "import toml\n",
    "from paddle_quantum.qml.bert_qtc import train, inference\n",
    "\n",
    "config = toml.loads(test_toml)\n",
    "task = config.pop('task')\n",
    "if task == 'train':\n",
    "    train(**config)\n",
    "elif task == 'test':\n",
    "    prediction = inference(**config)\n",
    "    text = config['text']\n",
    "    print(f'The input text is {text}.')\n",
    "    print(f'The prediction of the model is {prediction}.')\n",
    "else:\n",
    "    raise ValueError(\"Unknown task, it can be train or test.\")"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Here, we only need to modify the content of the text in the configuration file, and then run the entire code to quickly test other texts."
   ]
  },
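  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "For example, the cell below (which assumes the cells above have already been run, so that `config` and `inference` are in scope) keeps the same configuration and only swaps in another sentence from the dataset samples shown earlier."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Reuse the configuration from above and only change the input text.\n",
    "text = '我想知道浙江义乌的天气'\n",
    "config['text'] = text\n",
    "prediction = inference(**config)\n",
    "print(f'The input text is {text}.')\n",
    "print(f'The prediction of the model is {prediction}.')"
   ]
  },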
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Note\n",
    "\n",
    "Here, we provide only a demo model, and its effect is only for demonstration. For practical application scenarios, targeted design and training are needed to achieve better results.\n",
    "\n",
    "### The structure of the dataset\n",
    "\n",
    "If you want to use a custom dataset for training, you just need to prepare the dataset according to the rules. Prepare `train.txt` and `test.txt` in the dataset folder, and `dev.txt` if a validation set is needed. One line is used to represent one piece of data in each file. Each line contains text and a corresponding label, separated by tabs. Text is composed of space-separated words.\n",
    "\n",
    "### Introduction to the Configuration File\n",
    "\n",
    "In `test.toml`, there is a complete reference to the configuration files needed for testing. In `train.toml`, there is a complete reference to the configuration files needed for training. Use `python intent_classification --config train.toml` to train the model. Use `python intent_classification --config test.toml` to load the trained model for testing.\n"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## References\n",
    "\n",
    "[1] Yang C H H, Qi J, Chen S Y C, et al. When BERT meets quantum temporal convolution learning for text classification in heterogeneous computing[C]//ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2022: 8602-8606.\n",
    "\n",
    "[2] Zhang W N, Chen Z, Che W, et al. The first evaluation of Chinese human-computer dialogue technology[J]. arXiv preprint arXiv:1709.10217, 2017.\n",
    "\n",
    "[3] Devlin J, Chang M W, Lee K, et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding[C]//Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers). 2019: 4171-4186."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "temp",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.15"
  },
  "orig_nbformat": 4
 },
 "nbformat": 4,
 "nbformat_minor": 2
}