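{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Quantum Fisher Information\n", "\n", " Copyright (c) 2021 Institute for Quantum Computing, Baidu Inc. All Rights Reserved. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Overview\n", "\n", "This tutorial briefly introduces the concepts of classical Fisher information (CFI) and quantum Fisher information (QFI), describes their applications in quantum machine learning, and shows how to compute them with Paddle Quantum." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Background\n", "\n", "The concept of quantum Fisher information originated in quantum sensing and has gradually become a general tool for studying parameterized quantum systems [[1]](https://arxiv.org/abs/2103.15191), for example in characterizing overparameterization [[2]](https://arxiv.org/abs/2102.01659) and in quantum natural gradient descent [[3]](https://arxiv.org/abs/1909.02108). Quantum Fisher information is the natural analogue of classical Fisher information for quantum systems. Classical Fisher information characterizes how sensitive a parameterized *probability distribution* is to changes in its parameters, whereas quantum Fisher information characterizes how sensitive a parameterized *quantum state* is to changes in its parameters.\n", "\n", "Traditionally, classical Fisher information is introduced as part of parameter estimation in mathematical statistics, which can be complicated and unintuitive for beginners. This tutorial instead introduces classical Fisher information from a geometric point of view, which not only aids intuition but also makes its connection to quantum Fisher information easier to see." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Classical Fisher information\n", "\n", "We start with classical Fisher information. For a parameterized probability distribution $p(\\boldsymbol{x};\\boldsymbol{\\theta})$, consider the following question:\n", "\n", "- How much does the probability distribution change under a slight change of the parameters?\n", "\n", "This is a question about perturbations, so it is natural to attempt something like a Taylor expansion. Before doing so, we need to know which function to expand, that is, we need to quantify *the change in the probability distribution*. More formally, we need to define a *distance* between any two probability distributions, denoted $d(p(\\boldsymbol{x};\\boldsymbol{\\theta}),p(\\boldsymbol{x};\\boldsymbol{\\theta}'))$, or $d(\\boldsymbol{\\theta},\\boldsymbol{\\theta}')$ for short.\n", "\n", "In general, a valid distance should be non-negative and should vanish if and only if the two points coincide, i.e.\n", "\n", "$$\n", "\\begin{aligned}\n", "&d(\\boldsymbol{\\theta},\\boldsymbol{\\theta}')\\geq 0,\\\\\n", "&d(\\boldsymbol{\\theta},\\boldsymbol{\\theta}')=0~\\Leftrightarrow~\\boldsymbol{\\theta}=\\boldsymbol{\\theta}'.\n", "\\end{aligned}\n", "\\tag{1}\n", "$$\n", "\n", "Now consider the expansion of the distance function over a short displacement, $d(\\boldsymbol{\\theta},\\boldsymbol{\\theta}+\\boldsymbol{\\delta})$. Assuming $d$ is sufficiently smooth, the conditions above imply\n", "\n", "$$\n", "\\begin{aligned}\n", "&d(\\boldsymbol{\\theta},\\boldsymbol{\\theta})=0~\\Rightarrow~\\text{zeroth-order term}=0,\\\\\n", "&d(\\boldsymbol{\\theta},\\boldsymbol{\\theta}+\\boldsymbol{\\delta})\\geq 0~\\Rightarrow~\\boldsymbol{\\delta}=0~\\text{is a minimum}\n", 
"~\\Rightarrow~\\text{first-order term}=0.\n", "\\end{aligned}\n", "\\tag{2}\n", "$$\n", "\n", "Therefore, the lowest-order term that can contribute to this expansion is of second order, and the distance can be written as\n", "\n", "$$\n", "\\begin{aligned}\n", "d(\\boldsymbol{\\theta},\\boldsymbol{\\theta}+\\boldsymbol{\\delta})\n", "=\\frac{1}{2}\\sum_{ij}\\delta_iM_{ij}\\delta_j+O(\\|\\boldsymbol{\\delta}\\|^3) \n", "=\\frac{1}{2} \\boldsymbol{\\delta}^T M \\boldsymbol{\\delta} + O(\\|\\boldsymbol{\\delta}\\|^3),\n", "\\end{aligned}\n", "\\tag{3}\n", "$$\n", "\n", "where\n", "\n", "$$\n", "M_{ij}(\\boldsymbol{\\theta})=\\left.\\frac{\\partial^2}{\\partial\\delta_i\\partial\\delta_j}d(\\boldsymbol{\\theta},\\boldsymbol{\\theta}+\\boldsymbol{\\delta})\\right|_{\\boldsymbol{\\delta}=0},\n", "\\tag{4}\n", "$$\n", "\n", "is precisely the Hessian matrix of the distance function, evaluated at $\\boldsymbol{\\delta}=0$. In the framework of differential geometry, this is called the [metric](http://en.wikipedia.org/wiki/Metric_tensor) of the manifold. This simple derivation shows that a small distance can always be approximated by a quadratic form in the parameters (see Figure 1), and that the coefficient matrix of this quadratic form is, up to a factor of $1/2$, the Hessian matrix of the distance function.\n", "\n", "![feature map](./figures/FIM-fig-Sphere-metric.png \"Figure 1. Approximate a small distance on the 2-sphere as a quadratic form\")\n", "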