{ "cells": [ { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "## Introduction to Intent Classification\n", "\n", "*Copyright (c) 2023 Institute for Quantum Computing, Baidu Inc. All Rights Reserved.*\n", "\n", "Natural language processing (NLP) is a machine learning technology that gives computers the ability to interpret, manipulate, and comprehend human language. Organizations today have large volumes of voice and text data from various communication channels like emails, text messages, social media newsfeeds, video, audio, and more. They use NLP software to automatically process this data, analyze the intent or sentiment in the message, and respond in real time to human communication.\n", "\n", "Intent classification is one of the fundamental tasks in NLP and has important applications in products such as search engines, intelligent customer service, and robotics.\n", "\n", "Here, we use BERT-QTC [1], a quantum-classical hybrid model, to implement the intent recognition task. Here, our intention recognition task is to determine, for the input text, the intention corresponding to this sentence, such as chatting, asking for a recipe, asking for a TV channel, etc.\n", "\n", "We used the [SMP2017 dataset](https://github.com/HITlilingzhi/SMP2017ECDT-DATA) [2] for our experiment. We select four of the classes for training, which are train, music, weather, message, telephone, flight, and news. A sample of the data is as follows:\n", "\n", "- 查宁波到北京的火车票\n", "- 我想知道浙江义乌的天气\n", "- 帮我查一下明天广州到长沙的航班\n", "- 我想听最新军事新闻。\n" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "## Intent Classification Using BERT-QTC Model\n", "\n", "The BERT-QTC model is a quantum-classical hybrid model. It has the following model structure:\n", "\n", "![the arch of the bert-qtc model](bert_qtc_arch.png)\n", "\n", "The workflow of the model is as follows:\n", "\n", "1. Use BERT [3] to extract the feature of the input text to obtain a sentence-level feature representation.\n", "2. For the features extracted by BERT, use Quantum Temporal Convolution (QTC) and Global Maxing Pooling (GMP) for further feature extraction and dimensionality reduction.\n", "3. Use the fully connected layer to perform classification prediction and obtain classification results.\n", "\n", "### Workflow\n", "\n", "BERT-QTC is a learning model. Thus, We need to use the dataset to train the model first. After the training converges, we get a trained model that can classify this type of data. Among them, because the BERT model is a large language model. Therefore, we use pre-trained models for feature extraction. In the subsequent model training process, the parameters of the BERT part are no longer trained.\n", "\n", "In summary, its workflow is as follows:\n", "\n", "1. Prepare a dataset for intent classification.\n", "2. Use the dataset to train the BERT-QTC model to get the trained model.\n", "3. Use the model to predict the input text and get the prediction result." ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "## How To Use\n", "\n", "Here, we give a trained model for testing. You only need to make corresponding configurations in the `example.toml` configuration file. Then enter `python intent_classification.py --config example.toml` to test the entered text.\n", "\n", "### Online Demo\n", "\n", "Here, we give a version of the online demo. First define the contents of the configuration file. 
" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "test_toml = r\"\"\"\n", "task = 'test'\n", "text = '查宁波到北京的火车票'\n", "num_filter = 1\n", "kernel_size = 5\n", "circuit_depth = 2\n", "padding = 2\n", "model_path = 'decoder.pdparams'\n", "bert_model = 'bert-base-chinese'\n", "hidden_size = 768\n", "classes = ['火车', '音乐', '天气', '短信', '电话', '航班', '新闻']\n", "\"\"\"" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "Next is the code for the prediction section." ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "The input text is 查宁波到北京的火车票.\n", "The prediction of the model is 火车.\n" ] } ], "source": [ "import os\n", "import warnings\n", "\n", "warnings.filterwarnings('ignore')\n", "os.environ['PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION'] = 'python'\n", "\n", "import toml\n", "from paddle_quantum.qml.bert_qtc import train, inference\n", "\n", "config = toml.loads(test_toml)\n", "task = config.pop('task')\n", "if task == 'train':\n", " train(**config)\n", "elif task == 'test':\n", " prediction = inference(**config)\n", " text = config['text']\n", " print(f'The input text is {text}.')\n", " print(f'The prediction of the model is {prediction}.')\n", "else:\n", " raise ValueError(\"Unknown task, it can be train or test.\")" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "Here, we only need to modify the content of the text in the configuration file, and then run the entire code to quickly test other texts." ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "## Note\n", "\n", "Here, we provide only a demo model, and its effect is only for demonstration. For practical application scenarios, targeted design and training are needed to achieve better results.\n", "\n", "### The structure of the dataset\n", "\n", "If you want to use a custom dataset for training, you just need to prepare the dataset according to the rules. Prepare `train.txt` and `test.txt` in the dataset folder, and `dev.txt` if a validation set is needed. One line is used to represent one piece of data in each file. Each line contains text and a corresponding label, separated by tabs. Text is composed of space-separated words.\n", "\n", "### Introduction to the Configuration File\n", "\n", "In `test.toml`, there is a complete reference to the configuration files needed for testing. In `train.toml`, there is a complete reference to the configuration files needed for training. Use `python intent_classification --config train.toml` to train the model. Use `python intent_classification --config test.toml` to load the trained model for testing.\n" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "## References\n", "\n", "[1] Yang C H H, Qi J, Chen S Y C, et al. When BERT meets quantum temporal convolution learning for text classification in heterogeneous computing[C]//ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2022: 8602-8606.\n", "\n", "[2] Zhang W N, Chen Z, Che W, et al. The first evaluation of Chinese human-computer dialogue technology[J]. arXiv preprint arXiv:1709.10217, 2017.\n", "\n", "[3] Devlin J, Chang M W, Lee K, et al. 
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## References\n",
    "\n",
    "[1] Yang C H H, Qi J, Chen S Y C, et al. When BERT meets quantum temporal convolution learning for text classification in heterogeneous computing[C]//ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2022: 8602-8606.\n",
    "\n",
    "[2] Zhang W N, Chen Z, Che W, et al. The first evaluation of Chinese human-computer dialogue technology[J]. arXiv preprint arXiv:1709.10217, 2017.\n",
    "\n",
    "[3] Devlin J, Chang M W, Lee K, et al. BERT: Pre-training of deep bidirectional transformers for language understanding[C]//Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers). 2019: 4171-4186."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "temp",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.15"
  },
  "orig_nbformat": 4
 },
 "nbformat": 4,
 "nbformat_minor": 2
}