{ "cells": [ { "cell_type": "markdown", "id": "adjacent-printing", "metadata": {}, "source": [ "# Variational Shadow Quantum Learning\n", "\n", " Copyright (c) 2021 Institute for Quantum Computing, Baidu Inc. All Rights Reserved. " ] }, { "cell_type": "markdown", "id": "rising-daisy", "metadata": {}, "source": [ "## Overview\n", "\n", "In this tutorial, we will discuss the workflow of Variational Shadow Quantum Learning (VSQL) [1] and accomplish a **binary classification** task using VSQL. VSQL is a hybrid quantum-classical framework for supervised quantum learning, which utilizes parameterized quantum circuits and classical shadows. Unlike commonly used variational quantum algorithms, the VSQL method extracts \"local\" features from subspaces rather than from the whole Hilbert space.\n", "\n", "### Background\n", "\n", "We consider a $k$-label classification problem. The input is a set containing $N$ labeled data points $D=\\left\\{(\\mathbf{x}^i, \\mathbf{y}^i)\\right\\}_{i=1}^{N}$, where $\\mathbf{x}^i\\in\\mathbb{R}^{m}$ is the data point and $\\mathbf{y}^i$ is a one-hot vector with length $k$ indicating which category the corresponding data point belongs to. Representing the labels as one-hot vectors is a common choice within the machine learning community. For example, for $k=3$, $\\mathbf{y}^{a}=[1, 0, 0]^\\text{T}, \\mathbf{y}^{b}=[0, 1, 0]^\\text{T}$, and $\\mathbf{y}^{c}=[0, 0, 1]^\\text{T}$ indicate that the $a^{\\text{th}}$, the $b^{\\text{th}}$, and the $c^{\\text{th}}$ data points belong to class 0, class 1, and class 2, respectively. **The learning process aims to train a model $\\mathcal{F}$ to predict the label of every data point as accurately as possible.**\n", "\n", "The realization of $\\mathcal{F}$ in VSQL is a combination of parameterized local quantum circuits, known as **shadow circuits**, and a classical fully-connected neural network (FCNN). VSQL requires preprocessing to encode classical information into quantum states. 
After encoding the data, we apply a parameterized local quantum circuit $U(\\mathbf{\\theta})$ convolutionally to the qubits of each encoded quantum state, where $\\mathbf{\\theta}$ is the vector of parameters in the circuit. Expectation values are then obtained by measuring local observables on these qubits. Finally, a classical FCNN postprocesses the measurement results.\n", "\n", "We can write the output of $\\mathcal{F}$, which is obtained from VSQL, as $\\tilde{\\mathbf{y}}^i = \\mathcal{F}(\\mathbf{x}^i)$. Here $\\tilde{\\mathbf{y}}^i$ is a probability distribution, where $\\tilde{y}^i_j$ is the probability of the $i^{\\text{th}}$ data point belonging to the $j^{\\text{th}}$ class. To train the model to predict the correct labels, we use the cumulative distance between $\\tilde{\\mathbf{y}}^i$ and $\\mathbf{y}^i$ as the loss function $\\mathcal{L}$ to be minimized:\n", "\n", "$$\n", "\\mathcal{L}(\\mathbf{\\theta}, \\mathbf{W}, \\mathbf{b}) = -\\frac{1}{N}\\sum\\limits_{i=1}^{N}\\sum\\limits_{j=1}^{k}y^i_j\\log{\\tilde{y}^i_j}, \\tag{1}\n", "$$\n", "\n", "where $\\mathbf{W}$ and $\\mathbf{b}$ are the weights and the bias of the one-layer FCNN. Note that this loss function is the cross-entropy loss [2].\n", "\n", "### Pipeline\n", "\n", "![pipeline](./figures/vsql-fig-pipeline.png \"Figure 1: Flow chart of VSQL\")\n", "