Improved Lower Bounds for Learning Quantum Channels in Diamond Distance

Aadil Oufkir [email protected] Mohammed VI Polytechnic University, Rocade Rabat-Salé, Technopolis, Morocco Filippo Girardi [email protected] Scuola Normale Superiore, Piazza dei Cavalieri 7, 56126 Pisa, Italy
Abstract

We prove that learning an unknown quantum channel with input dimension $d_A$, output dimension $d_B$, and Choi rank $r$ to diamond distance $\varepsilon$ requires $\Omega\!\left(\frac{d_A d_B r}{\varepsilon\log(d_B r/\varepsilon)}\right)$ queries. This improves the best previous $\Omega(d_A d_B r)$ bound by introducing explicit $\varepsilon$-dependence, with a scaling in $\varepsilon$ that is near-optimal when $d_A = r d_B$ but not tight in general. The proof constructs an ensemble of channels that are well-separated in diamond norm yet admit Stinespring isometries that are close in operator norm.

1 Introduction

In [AMele2025, chen2025quantumchanneltomographyestimation] it is proved that there exists a quantum learning algorithm that uses

\[ N = O\left(\frac{d_A d_B r}{\varepsilon^{2}}\right) \tag{1} \]

parallel queries of any unknown channel $\mathcal{N}$ with input dimension $d_A$, output dimension $d_B$, and Choi rank $r$ and, with probability at least $2/3$, outputs a classical description of a channel $\hat{\mathcal{N}}$ that is within diamond distance $\varepsilon$ of $\mathcal{N}$. Moreover, in [Girardi2025Dec] it is proved that any quantum algorithm learning $\mathcal{N}$ up to constant error with success probability at least $2/3$ needs at least

\[ N = \Omega(d_A d_B r) \tag{2} \]

queries of $\mathcal{N}$. The aim of our work is to improve this lower bound by making it depend on the target diamond distance $\varepsilon$. More precisely, our main result (see Theorem 3) identifies the new lower bound

\[ N = \Omega\left(\frac{d_A d_B r}{\varepsilon\log(d_B r/\varepsilon)}\right). \tag{3} \]

This result, combined with the upper bound of [chen2025quantumchanneltomographyestimation], shows that the optimal dependency on the precision parameter $\varepsilon$ is $\tilde{\Theta}(\frac{1}{\varepsilon})$ when $d_A = d_B r$. In particular, for learning unitary channels, our result recovers the optimal lower bound $\Omega(\frac{d^{2}}{\varepsilon})$ of [haah2023query] up to a logarithmic factor, via a different proof strategy that applies specifically in the coherent setting. Moreover, our approach generalizes to non-unitary channels.

When $d_A = 1$, channel learning becomes state learning, for which the optimal complexity is $\tilde{\Theta}(\frac{d_B r}{\varepsilon^{2}})$ [ODonnell2016-1, haah2017sample]. This shows that the $\varepsilon$-dependency in our lower bound is not optimal in general.

The proof of our lower bound consists of two steps. The first is a general lower bound (Theorem 1), which leverages any ensemble of channels $\{\Phi_i\}$ that are pairwise far in diamond distance, yet whose Stinespring isometries are pairwise close in operator norm. The resulting lower bound scales logarithmically with the size of the ensemble and inversely with the distance between the Stinespring isometries. The second step is the actual construction of a suitable ensemble of channels. A natural strategy to this end is to use existing packing nets. However, as we show in Appendix A, this approach, although simpler, yields a weaker bound:

\[ N \geq \Omega\left(\frac{d_A d_B r}{\sqrt{\varepsilon}\log(d_B r/\sqrt{\varepsilon})}\right). \tag{4} \]

Instead, using a probabilistic approach, we construct a random family of Stinespring isometries which, with positive probability, are sufficiently close in operator norm but give rise to channels that are far enough apart in diamond distance. This argument ensures the existence of an ensemble that produces the desired lower bound (Theorem 3).

The remainder of the manuscript is organised as follows. In Section 1 we introduce the notation and the definitions that we are going to use in the paper. In Section 2 we state and prove Theorem 1, i.e. the general lower bound on channel learning constructed in terms of ensembles of quantum channels. In Section 3 we prove the lower bound (3) leveraging a family of particular isometries (Lemma 2) in order to construct a suitable ensemble of channels to be used in Theorem 1 (see Theorem 3). The isometries of Lemma 2 are identified using a random construction, which is discussed in Section 4. In the Appendix we provide the proofs that were deferred in the previous sections to improve readability.

1.1 Notation

All the Hilbert spaces that we consider are finite-dimensional. Let $\mathcal{H}_A\cong\mathbb{C}^{d_A}$ and $\mathcal{H}_B\cong\mathbb{C}^{d_B}$ denote the input and output spaces. We write $\mathcal{L}(\mathcal{H})$ for the linear operators on $\mathcal{H}$, and $\mathcal{D}(\mathcal{H})$ for the quantum states (positive semi-definite operators with unit trace). The operator norm is $\|X\|_{\mathrm{op}}=\sup_{\|\psi\|=1}\|X\ket{\psi}\|$, and the trace norm is $\|X\|_{1}=\operatorname{Tr}\sqrt{X^{\dagger}X}$. For $\rho\in\mathcal{D}(\mathcal{H})$, the von Neumann entropy is $S(\rho)=-\operatorname{Tr}[\rho\log\rho]$. All logarithms are in base $\mathrm{e}$.

1.2 Quantum channels and their representations

A quantum channel $\Phi:\mathcal{L}(\mathcal{H}_A)\to\mathcal{L}(\mathcal{H}_B)$ is a completely positive trace-preserving (CPTP) map. It can be written in the Kraus representation as $\Phi(X)=\sum_{i=1}^{r}K_i X K_i^{\dagger}$ with $\sum_i K_i^{\dagger}K_i=\mathds{1}_A$. The minimal $r$ is called the Kraus rank.

For any channel $\Phi$, there exists an isometry $V:\mathcal{H}_A\to\mathcal{H}_B\otimes\mathcal{H}_E$ (that is, $V^{\dagger}V=\mathds{1}_A$) such that $\Phi(X)=\operatorname{Tr}_E(VXV^{\dagger})$. Such an isometry $V$ is called a Stinespring dilation of $\Phi$. The minimal dimension of $\mathcal{H}_E$ equals the Kraus rank.
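As a small numerical illustration of the relation between the Kraus and Stinespring pictures, the following sketch stacks Kraus operators into an isometry and checks that tracing out the environment reproduces the channel. The amplitude-damping example, the helper names, and the (E, B) index ordering are assumptions made here for illustration; they are not taken from the paper.

```python
import numpy as np

def stinespring_from_kraus(kraus):
    """Stack Kraus operators K_i (each d_B x d_A) into V = sum_i |i>_E (x) K_i.

    The output index is ordered as (E, B), so V has shape (r * d_B, d_A).
    """
    return np.concatenate(kraus, axis=0)

def apply_via_stinespring(V, X, d_B):
    """Compute Tr_E(V X V^dagger) for V with output ordering (E, B)."""
    r = V.shape[0] // d_B
    W = (V @ X @ V.conj().T).reshape(r, d_B, r, d_B)
    return np.einsum("ibic->bc", W)  # partial trace over the environment E

# Illustrative channel: qubit amplitude damping with damping probability g.
g = 0.3
K0 = np.array([[1, 0], [0, np.sqrt(1 - g)]], dtype=complex)
K1 = np.array([[0, np.sqrt(g)], [0, 0]], dtype=complex)
kraus = [K0, K1]

V = stinespring_from_kraus(kraus)
X = np.array([[0.6, 0.2 + 0.1j], [0.2 - 0.1j, 0.4]])

out_kraus = sum(K @ X @ K.conj().T for K in kraus)
out_stinespring = apply_via_stinespring(V, X, d_B=2)

print(np.allclose(V.conj().T @ V, np.eye(2)))   # isometry condition V^dagger V = 1
print(np.allclose(out_kraus, out_stinespring))  # both representations agree
```

Here the environment dimension equals the number of Kraus operators, which matches the Kraus rank only when the chosen Kraus decomposition is minimal.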

The Choi state of the channel $\Phi$ is

\[ J(\Phi)\coloneqq(\mathds{1}_{A'}\otimes\Phi)(\ket{\Psi}\bra{\Psi})\in\mathcal{L}(\mathcal{H}_{A'}\otimes\mathcal{H}_B), \tag{5} \]

where $\ket{\Psi}_{A'A}\coloneqq\frac{1}{\sqrt{d_A}}\sum_{i=1}^{d_A}\ket{i}_{A'}\otimes\ket{i}_{A}$ is the normalised maximally entangled state between $A'$ and $A$. The linear map $\Phi$ is CPTP if and only if $J(\Phi)\geq 0$ and $\operatorname{Tr}_B J(\Phi)=\mathds{1}_{A'}/d_A$.

The Choi rank $\operatorname{rank}_{\text{Choi}}\Phi\coloneqq\operatorname{rank}(J(\Phi))$ equals the Kraus rank and the minimal environment dimension of Stinespring dilations.
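The definition (5) and the CPTP characterisation can be checked numerically on a toy example. The sketch below builds the Choi state from a Kraus decomposition; the function name `choi_state`, the amplitude-damping channel, and the rank tolerance are illustrative assumptions.

```python
import numpy as np

def choi_state(kraus, d_A):
    """J(Phi) = (id (x) Phi)(|Psi><Psi|), with |Psi> = d_A^{-1/2} sum_i |i>|i>."""
    d_B = kraus[0].shape[0]
    J = np.zeros((d_A * d_B, d_A * d_B), dtype=complex)
    for K in kraus:
        # (1 (x) K)|Psi> has components v[i * d_B + b] = K[b, i] / sqrt(d_A)
        v = K.T.reshape(-1) / np.sqrt(d_A)
        J += np.outer(v, v.conj())
    return J

# Illustrative channel: qubit amplitude damping with damping probability g.
g = 0.3
K0 = np.array([[1, 0], [0, np.sqrt(1 - g)]], dtype=complex)
K1 = np.array([[0, np.sqrt(g)], [0, 0]], dtype=complex)

J = choi_state([K0, K1], d_A=2)
choi_rank = np.linalg.matrix_rank(J, tol=1e-10)

# Partial trace over B; for a trace-preserving map this equals identity / d_A.
marginal = np.einsum("ibjb->ij", J.reshape(2, 2, 2, 2))

print(choi_rank)
print(np.allclose(marginal, np.eye(2) / 2))
```

For this example the two Kraus operators are linearly independent, so the computed Choi rank coincides with the number of Kraus operators.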

1.3 Channel ensembles with distance constraints

Let $\mathcal{C}(d_A,d_B,r)$ denote the set of quantum channels $\Phi:\mathcal{L}(\mathcal{H}_A)\to\mathcal{L}(\mathcal{H}_B)$ with Choi rank at most $r$, i.e.

\[ \mathcal{C}(d_A,d_B,r)\coloneqq\{\Phi\text{ quantum channel}\mid\operatorname{rank}_{\text{Choi}}\Phi\leq r\}. \tag{6} \]

For a channel $\Phi\in\mathcal{C}(d_A,d_B,r)$, let $V_{\Phi}:\mathcal{H}_A\to\mathcal{H}_B\otimes\mathcal{H}_E$ be a Stinespring isometry with minimal environment dimension $d_E\leq r$.

For channels $\Phi,\Psi:\mathcal{L}(\mathcal{H}_A)\to\mathcal{L}(\mathcal{H}_B)$, the diamond distance is defined as

\[ \|\Phi-\Psi\|_{\diamond}\coloneqq\sup_{\rho_{RA}\in\mathcal{D}(\mathcal{H}_R\otimes\mathcal{H}_A)}\left\|(\mathds{1}_R\otimes\Phi)(\rho_{RA})-(\mathds{1}_R\otimes\Psi)(\rho_{RA})\right\|_{1}, \tag{7} \]

where the supremum is over all auxiliary spaces $\mathcal{H}_R$ and states $\rho_{RA}$.

We say that two channels $\Phi,\Psi\in\mathcal{C}(d_A,d_B,r)$ are $\varepsilon$-diamond-far if

\[ \|\Phi-\Psi\|_{\diamond}>\varepsilon. \tag{8} \]

We say that their Stinespring isometries are $\eta$-operator-norm-close if there exist choices of Stinespring isometries $V_{\Phi},V_{\Psi}$ such that

\[ \|V_{\Phi}-V_{\Psi}\|_{\mathrm{op}}\leq\eta. \tag{9} \]

Finally, we define $\mathcal{E}(d_A,d_B,r,\varepsilon,\eta)$ as the set of ensembles of channels that are pairwise $2\varepsilon$-diamond-far but have $\eta$-close Stinespring isometries:

\[ \mathcal{E}(d_A,d_B,r,\varepsilon,\eta)=\left\{\{\Phi_i\}_{i=1}^{M}\subset\mathcal{C}(d_A,d_B,r)\;\middle|\;\begin{array}{l}\forall i\neq j:\ \|\Phi_i-\Phi_j\|_{\diamond}>2\varepsilon,\\ \exists\text{ Stinespring isometries }\{V_i\}_{i=1}^{M}\\ \text{such that }\forall i\neq j:\ \|V_i-V_j\|_{\mathrm{op}}\leq\eta\end{array}\right\}. \tag{10} \]

1.4 The coherent query model

In the coherent query model for quantum channel learning, the learning algorithm is allowed to interleave queries to the unknown channel with arbitrary, adaptively chosen quantum operations. Formally, the algorithm prepares an initial quantum state $\rho$ on a system comprising the input space $\mathcal{H}_A$ of dimension $d_A$ together with an auxiliary system $\mathcal{H}_{\text{aux}}$ of arbitrary dimension. It then performs $N$ uses of the unknown channel $\Phi$, interspersed with arbitrary quantum channels $\{\mathcal{N}_i\}_{i=1}^{N-1}$ (the intermediate operations) that act jointly on the output space $\mathcal{H}_B$ and the auxiliary system. The final state after $N$ queries is

\[ \rho^{\text{output}}=\bigl[\Phi\otimes\mathrm{Id}_{\text{aux}}\bigr]\circ\mathcal{N}_{N-1}\circ\bigl[\Phi\otimes\mathrm{Id}_{\text{aux}}\bigr]\circ\cdots\circ\mathcal{N}_{1}\circ\bigl[\Phi\otimes\mathrm{Id}_{\text{aux}}\bigr](\rho)\,, \tag{11} \]

where each $\Phi\otimes\mathrm{Id}_{\text{aux}}$ denotes a query to the unknown channel acting on the input system while leaving the auxiliary system unchanged. Finally, the algorithm measures $\rho^{\text{output}}$ with a positive operator-valued measure (POVM) to produce a classical description of an estimate $\hat{\Phi}$. The query complexity is the minimum number $N$ of uses of $\Phi$ required to output, with high probability, an estimate $\hat{\Phi}$ such that $\|\Phi-\hat{\Phi}\|_{\diamond}\leq\varepsilon$.

This model generalizes the parallel (or non-adaptive) query model, in which all $N$ uses of $\Phi$ are applied in parallel on a (possibly entangled) input state, corresponding to the special case where the intermediate operations $\mathcal{N}_i$ are all identity channels. The coherent model captures the most general physically realizable learning procedure that respects causality and does not assume access to the inverse or conjugate of $\Phi$. It is the natural setting for studying the fundamental quantum limits of channel learning when arbitrary quantum processing between queries is allowed.

2 A general lower bound on channel learning

In this section, we prove a general lower bound for learning a quantum channel in diamond distance.

Theorem 1 (General lower bound).

Let $d_A,d_B,r\geq 1$, $M\geq 3$ and $\varepsilon,\eta\in(0,1/2)$. Consider an ensemble $\{\Phi_i\}_{i=1}^{M}\in\mathcal{E}(d_A,d_B,r,\varepsilon,\eta)$ of $M$ quantum channels that are pairwise $2\varepsilon$-diamond-far and whose Stinespring isometries are pairwise $\eta$-operator-norm-close. Any coherent algorithm that constructs $\hat{\Phi}_i$ such that $\|\Phi_i-\hat{\Phi}_i\|_{\diamond}\leq\varepsilon$ with probability at least $2/3$ for all $i\in[M]$ needs at least

\[ N=\left\lceil\frac{(2/3)\log(M)-\log 2}{4\eta\log(d_B r/\eta)}\right\rceil \tag{12} \]

uses of the unknown channel.

We follow a standard strategy for proving lower bounds for learning problems (e.g., [flammia2012quantum, haah2017sample, lowe2022lower, fawzi2023lower, oufkir2023sample, Bluhm2024Mar, Rosenthal2024Sep, Mele_2025]).

Proof.
Figure 1: Schematic representation of the ensemble $\{\Phi_i\}_{i=1}^{M}$ (labels in the figure: $X$, $\hat{\Phi}_X$, $\varepsilon$, $V_i$, $V_j$, $\eta$).

Let us consider any fixed coherent algorithm that constructs $\hat{\Phi}_i$ such that $\|\Phi_i-\hat{\Phi}_i\|_{\diamond}\leq\varepsilon$ with probability at least $2/3$ for all $i\in[M]$. Let $X\sim\mathrm{Uniform}[M]$ and let $Y$ be the index output by this algorithm upon receiving the quantum channel $\Phi_X^{\otimes N}$. Since the quantum channels $\{\Phi_i\}_{i=1}^{M}$ are pairwise $2\varepsilon$-diamond-far, the algorithm can recover $X$ simply by picking the channel in $\{\Phi_i\}_{i=1}^{M}$ that is $\varepsilon$-close to $\hat{\Phi}_X$ (see Figure 1). Hence, by Fano's inequality [FANO] we have:

\[ I(X:Y)\geq(2/3)\log(M)-\log 2. \tag{13} \]

A coherent algorithm using the quantum channel $\Phi_x(\,\cdot\,)$ chooses the input state $\rho$ and the channels $\mathcal{N}_1,\dots,\mathcal{N}_{N-1}$, and measures the output state:

\[ \sigma_x^{N}=[\Phi_x\otimes\mathrm{Id}]\circ\mathcal{N}_{N-1}\circ\cdots\circ\mathcal{N}_{1}\circ[\Phi_x\otimes\mathrm{Id}](\rho). \tag{14} \]

We can suppose that the $N$ copies of $\Phi_x$ act on different systems $\{A_k\}_{k\in[N]}$ of dimension $d_A$ (we can include swap channels in $\mathcal{N}_1,\dots,\mathcal{N}_{N-1}$ if necessary). We can assume, without loss of generality, that all the channels $\mathcal{N}_1,\dots,\mathcal{N}_{N-1}$ are isometries, up to modifying the measurement at the end. Similarly, we can suppose that the Stinespring isometry $V_x^{A_k\to B_k}=V_{\Phi_x}^{A_k\to B_k}$ is applied instead of $\Phi_x$ directly after $\mathcal{N}_{k-1}$ for $k=1,\dots,N$. The global system is thus $B_1\cdots B_N E$, where $|B_k|=d_B r$ and $E$ is an ancilla system of arbitrary dimension. The global state before measurement becomes

\[ \sigma_x^{N}=[\mathcal{V}_x\otimes\mathrm{Id}]\circ\mathcal{N}_{N-1}\circ\cdots\circ\mathcal{N}_{1}\circ[\mathcal{V}_x\otimes\mathrm{Id}](\rho), \tag{15} \]

with $\mathcal{V}_x(\cdot)=V_x(\cdot)V_x^{\dagger}$. For $k\in[N]$, we denote $\sigma_x^{k}=\mathcal{V}_x\circ\mathcal{N}_{k-1}\circ\cdots\circ\mathcal{N}_{1}\circ\mathcal{V}_x(\rho)$, $\sigma_x^{0}=\rho$ and $\mathcal{N}_0=\mathrm{Id}$, so that we have

\[ \sigma_x^{k}=\mathcal{V}_x\circ\mathcal{N}_{k-1}(\sigma_x^{k-1}). \tag{16} \]

Denote $\pi_k=\frac{1}{M}\sum_{x=1}^{M}\mathcal{V}_x\circ\mathcal{N}_{k-1}(\sigma_x^{k-1})$ and $\xi_k=\frac{1}{M}\sum_{x=1}^{M}\mathcal{V}_1\circ\mathcal{N}_{k-1}(\sigma_x^{k-1})$. The mutual information between $X$ and the observation $Y$ of the coherent algorithm can then be bounded as follows:

\[
\begin{aligned}
I(X:Y) &\overset{\text{(i)}}{\leq} S\left(\frac{1}{M}\sum_{x=1}^{M}\sigma_x^{N}\right)-\frac{1}{M}\sum_{x=1}^{M}S\left(\sigma_x^{N}\right)\\
&\overset{\text{(ii)}}{=}\sum_{k=1}^{N}S\left(\frac{1}{M}\sum_{x=1}^{M}\sigma_x^{k}\right)-\sum_{k=1}^{N}S\left(\frac{1}{M}\sum_{x=1}^{M}\sigma_x^{k-1}\right)\\
&\overset{\text{(iii)}}{=}\sum_{k=1}^{N}S\left(\frac{1}{M}\sum_{x=1}^{M}\mathcal{V}_x\circ\mathcal{N}_{k-1}(\sigma_x^{k-1})\right)-\sum_{k=1}^{N}S\left(\frac{1}{M}\sum_{x=1}^{M}\mathcal{V}_1\circ\mathcal{N}_{k-1}(\sigma_x^{k-1})\right)\\
&=\sum_{k=1}^{N}S\left(B_1\cdots B_{k-1}B_kA_{k+1}\cdots A_NE\right)_{\pi_k}-\sum_{k=1}^{N}S\left(B_1\cdots B_{k-1}B_kA_{k+1}\cdots A_NE\right)_{\xi_k},
\end{aligned}
\tag{17}
\]

where (i) uses Holevo's theorem [holevo1973bounds]; (ii) is a telescoping sum and uses the fact that $S(\sigma_x^{N})=S(\rho)$ for all $x$, as all the applied operations are isometries; (iii) uses the assumption that $\mathcal{N}_{k-1}$ and $\mathcal{V}_1$ are isometry channels.

Now, observe that $\mathrm{Tr}_{B_k}[\pi_k]=\mathrm{Tr}_{B_k}[\xi_k]$, so we can apply the continuity bound of [Berta2024Aug, Theorem 5]:

\[
\begin{aligned}
&S\left(B_1\cdots B_{k-1}B_kA_{k+1}\cdots A_NE\right)_{\pi_k}-S\left(B_1\cdots B_{k-1}B_kA_{k+1}\cdots A_NE\right)_{\xi_k}\\
&\quad=H\left(B_k|B_1\cdots B_{k-1}A_{k+1}\cdots A_NE\right)_{\pi_k}-H\left(B_k|B_1\cdots B_{k-1}A_{k+1}\cdots A_NE\right)_{\xi_k}\\
&\quad\leq\|\pi_k-\xi_k\|_{1}\log(|B_k|^{2})+h_2(\|\pi_k-\xi_k\|_{1}),
\end{aligned}
\tag{18}
\]

with $h_2(a)=-a\log a-(1-a)\log(1-a)$ being the binary entropy. We have that

\[
\begin{aligned}
\|\pi_k-\xi_k\|_{1}&=\left\|\frac{1}{M}\sum_{x=1}^{M}\mathcal{V}_x\circ\mathcal{N}_{k-1}(\sigma_x^{k-1})-\frac{1}{M}\sum_{x=1}^{M}\mathcal{V}_1\circ\mathcal{N}_{k-1}(\sigma_x^{k-1})\right\|_{1}\\
&\leq\frac{1}{M}\sum_{x=1}^{M}\left\|(\mathcal{V}_x-\mathcal{V}_1)\circ\mathcal{N}_{k-1}(\sigma_x^{k-1})\right\|_{1}\\
&=\frac{1}{M}\sum_{x=1}^{M}\left\|V_x\zeta V_x^{\dagger}-V_1\zeta V_1^{\dagger}\right\|_{1},
\end{aligned}
\tag{19}
\]

with $\zeta=\mathcal{N}_{k-1}(\sigma_x^{k-1})$ being a quantum state (which depends on $x$). Using the triangle inequality, we obtain

\[
\begin{aligned}
\frac{1}{M}\sum_{x=1}^{M}\left\|V_x\zeta V_x^{\dagger}-V_1\zeta V_1^{\dagger}\right\|_{1}&\leq\frac{1}{M}\sum_{x=1}^{M}\left(\left\|(V_x-V_1)\zeta V_x^{\dagger}\right\|_{1}+\left\|V_1\zeta(V_x-V_1)^{\dagger}\right\|_{1}\right)\\
&\leq\frac{1}{M}\sum_{x=1}^{M}\left(\|V_x-V_1\|_{\mathrm{op}}\left\|\zeta V_x^{\dagger}\right\|_{1}+\|V_x-V_1\|_{\mathrm{op}}\left\|V_1\zeta\right\|_{1}\right)\\
&\leq 2\eta.
\end{aligned}
\tag{20}
\]
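The single-term inequality behind (20), namely $\|V\zeta V^{\dagger}-W\zeta W^{\dagger}\|_1\leq 2\|V-W\|_{\mathrm{op}}$ for isometries $V,W$ and a state $\zeta$, can be sanity-checked numerically. The sketch below uses random instances; the dimensions, the perturbation size 0.05, and all helper names are arbitrary choices made here, not quantities from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def complex_gaussian(shape, rng):
    return rng.normal(size=shape) + 1j * rng.normal(size=shape)

def nearest_isometry(W):
    """Polar factor of W: an isometry with orthonormal columns."""
    U, _, Vh = np.linalg.svd(W, full_matrices=False)
    return U @ Vh

def random_state(d, rng):
    G = complex_gaussian((d, d), rng)
    rho = G @ G.conj().T
    return rho / np.trace(rho).real

def trace_norm(X):
    return np.linalg.svd(X, compute_uv=False).sum()

d_in, d_out = 3, 6
V1 = nearest_isometry(complex_gaussian((d_out, d_in), rng))
# A nearby isometry obtained by perturbing V1 and projecting back.
Vx = nearest_isometry(V1 + 0.05 * complex_gaussian((d_out, d_in), rng))
zeta = random_state(d_in, rng)

lhs = trace_norm(Vx @ zeta @ Vx.conj().T - V1 @ zeta @ V1.conj().T)
rhs = 2 * np.linalg.norm(Vx - V1, 2)  # 2 * operator norm of the difference
print(lhs <= rhs)
```

In the proof the state $\zeta$ also lives on the remaining systems, on which $V_x$ acts trivially; the check above omits this tensor factor for brevity.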

Therefore, we deduce

\[
\begin{aligned}
I(X:Y)&\leq\sum_{k=1}^{N}S\left(B_1\cdots B_{k-1}B_kA_{k+1}\cdots A_NE\right)_{\pi_k}-\sum_{k=1}^{N}S\left(B_1\cdots B_{k-1}B_kA_{k+1}\cdots A_NE\right)_{\xi_k}\\
&\leq\sum_{k=1}^{N}\left(\|\pi_k-\xi_k\|_{1}\log(|B_k|^{2})+h_2(\|\pi_k-\xi_k\|_{1})\right)\\
&\leq\sum_{k=1}^{N}\left(2\eta\log((d_Br)^{2})+h_2(2\eta)\right)\\
&\leq 4N\eta\log(d_Br/\eta),
\end{aligned}
\tag{21}
\]

where we used that $h_2(a)\leq 2a\log(1/a)$ for $a\in(0,\frac{1}{2})$. Since $I(X:Y)\geq(2/3)\log(M)-\log 2$, we deduce that

\[ N\geq\frac{(2/3)\log(M)-\log 2}{4\eta\log(d_Br/\eta)}. \tag{22} \]

This concludes the proof. ∎
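The two elementary inequalities used in the last step of (21) can be checked on a grid. The sketch below is only a numerical sanity check under assumptions chosen here: the value `d = 6` standing in for $d_B r$ is arbitrary, and the second check is restricted to $\eta<1/4$ so that $2\eta$ lies in the range where the binary-entropy bound is stated.

```python
import numpy as np

def h2(a):
    """Binary entropy in nats."""
    return -a * np.log(a) - (1 - a) * np.log(1 - a)

# Check h2(a) <= 2 a log(1/a) on a grid inside (0, 1/2).
a = np.linspace(1e-6, 0.5 - 1e-6, 10_000)
entropy_gap = 2 * a * np.log(1 / a) - h2(a)

# Check the per-query step 2 eta log(d^2) + h2(2 eta) <= 4 eta log(d / eta).
d = 6  # illustrative stand-in for d_B * r
eta = np.linspace(1e-4, 0.25 - 1e-4, 10_000)
per_query = 2 * eta * np.log(d**2) + h2(2 * eta)
target = 4 * eta * np.log(d / eta)
step_gap = target - per_query

print(entropy_gap.min() >= 0, step_gap.min() >= 0)
```

Both gaps being non-negative on the grid is consistent with the chain $h_2(2\eta)\leq 4\eta\log(1/(2\eta))\leq 4\eta\log(1/\eta)$ for $\eta<1/4$.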

Given Theorem 1, we can prove lower bounds on learning quantum channels by constructing an ensemble $\{\Phi_i\}_{i=1}^{M}$ within $\mathcal{E}(d_A,d_B,r,\varepsilon,\eta)$ containing $M$ quantum channels that are pairwise $2\varepsilon$-diamond-far, yet whose Stinespring isometries are pairwise $\eta$-operator-norm-close. The resulting lower bound then scales with $\log M$ and inversely with $\eta$. To strengthen this bound, we should aim to construct an ensemble that maximizes $M$ while minimizing $\eta$ (see Footnote 1).

Footnote 1. Note that by the inequality $\|\Phi_x-\Phi_y\|_{\diamond}\leq\|V_x-V_y\|_{\mathrm{op}}$ [kretschmann2008information], the parameter $\eta$ should be at least $2\varepsilon$.
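To get a feel for how the bound (12) trades off $\log M$ against $\eta$, it can be evaluated directly. The helper below is a minimal sketch; the function name and the sample parameter values are assumptions introduced here, and the function takes $\log M$ rather than $M$ to avoid overflow for large ensembles.

```python
import math

def theorem1_lower_bound(log_M, eta, d_B, r):
    """Evaluate ceil(((2/3) log M - log 2) / (4 eta log(d_B r / eta))) from Eq. (12)."""
    if not (0 < eta < 0.5):
        raise ValueError("eta must lie in (0, 1/2)")
    numerator = (2.0 / 3.0) * log_M - math.log(2)
    denominator = 4.0 * eta * math.log(d_B * r / eta)
    # A negative numerator gives no information, so clip at zero.
    return max(0, math.ceil(numerator / denominator))

# Same ensemble size, two different closeness parameters (illustrative values).
loose = theorem1_lower_bound(log_M=1000.0, eta=0.1, d_B=4, r=2)
tight = theorem1_lower_bound(log_M=1000.0, eta=0.01, d_B=4, r=2)
print(loose, tight)
```

Shrinking $\eta$ by a factor of ten increases the bound by somewhat less than a factor of ten, because of the logarithmic term in the denominator.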

A natural approach to constructing such an ensemble is to use existing packing nets. However, this leads to an ensemble in $\mathcal{E}(d_A,d_B,r,\varepsilon,4\sqrt{\varepsilon})$ of cardinality $M$ satisfying $\log M=\Omega(d_Ad_Br)$, which by Theorem 1 implies the following weak lower bound (see Appendix A for details):

\[ N\geq\Omega\left(\frac{d_Ad_Br}{\sqrt{\varepsilon}\log(d_Br/\sqrt{\varepsilon})}\right). \tag{23} \]

In what follows, we improve the $\varepsilon$-dependence of this lower bound by constructing a new ensemble in $\mathcal{E}(d_A,d_B,r,\varepsilon,\eta)$ with comparable cardinality but with $\eta=O(\varepsilon)$ rather than $O(\sqrt{\varepsilon})$.

3 An ensemble yielding an improved lower bound

Figure 2: Schematic construction of the isometries $V_x$ and of the channels $\Phi_x$, where $U_x\sim\mathrm{Haar}(\mathrm{U}(rd_B))$, $\tilde{V}_x=U_xS$, $V_x=\sqrt{1-\varepsilon^{2}}\ket{0}\otimes\tilde{V}_0+\varepsilon\ket{1}\otimes\tilde{V}_x$, and $\Phi_x(\,\cdot\,)=\operatorname{Tr}_E[V_x\,\cdot\,V_x^{\dagger}]$. (Labels in the figure: $\tilde{V}_0$, $\tilde{V}_1,\dots,\tilde{V}_M$, $U_1,\dots,U_M$, $V_0$, $V_1,\dots,V_M$; isometry distance $O(\varepsilon)$, channel distance $\Omega(\varepsilon)$.)

In this section, we improve the bound (23) by constructing an ensemble in $\mathcal{E}(d_A,d_B,r,\Omega(\varepsilon),2\varepsilon)$ with cardinality $M$ satisfying $\log M=\Omega(d_Ad_Br)$. More precisely, we construct a set of isometries $\{V_x\}_{x\in[M]}$ corresponding to quantum channels $\{\Phi_x\}_{x\in[M]}$ such that $\|\Phi_x-\Phi_y\|_{\diamond}\geq\Omega(\varepsilon)$, $\|V_x-V_y\|_{\infty}\leq O(\varepsilon)$, and $\log M=\Omega(d_Ad_Br)$, as in Figure 2.

We prove the existence of such a set using a probabilistic argument. Let $\tilde{\Phi}_0$ be a quantum channel with Kraus operators $\{K_{0,i}\}_{i\in[r]}$ satisfying

\[ \left|\operatorname{Tr}\left[K_{0,i}^{\dagger}K_{0,j}\right]\right|\leq\frac{2d_A}{r}\delta_{i,j}\,,\quad\forall i,j\in[r]. \tag{24} \]

The existence of such a channel is shown in Appendix B. Let $\tilde{V}_0^{A\to BE}=\sum_{i=1}^{r}\ket{i}_E\otimes K_{0,i}$ be a Stinespring isometry of the quantum channel $\tilde{\Phi}_0$.

Lemma 2.

There exists a set of isometries $\{\tilde{V}_x^{A\to BE}\}_{x\in[M]}$ such that $\log M=\frac{1}{1201}d_Ad_Br$ and, for all $x\neq y$, we have

\[ \left\|\mathrm{Tr}_E\left[\tilde{V}_0\Psi_{A'A}(\tilde{V}_x^{\dagger}-\tilde{V}_y^{\dagger})\right]\right\|_{1}\geq 0.05, \tag{25} \]

where $\Psi_{A'A}\coloneqq\ket{\Psi}\!\bra{\Psi}_{A'A}$ is the maximally entangled state with $A'\simeq A$.

Proof.

The proof of this lemma is deferred to Section 4. ∎

We now have all the ingredients to prove the main result.

Theorem 3 (Improved lower bound for channel learning).
Let $d_A,d_B,r\geq 1$ be such that $d_Ad_Br\geq 2500$ and let $\varepsilon\in(0,10^{-4})$. Any coherent algorithm that constructs $\hat{\mathcal{N}}$ such that $\|\mathcal{N}-\hat{\mathcal{N}}\|_{\diamond}\leq\varepsilon$ with probability at least $2/3$ needs at least

\[ N=\Omega\left(\frac{d_Ad_Br}{\varepsilon\log(d_Br/\varepsilon)}\right) \tag{26} \]

uses of $\mathcal{N}$.
Proof.

Given the set of isometries $\{\tilde{V}_x^{A\to BE}\}_{x\in[M]}$ provided by Lemma 2, we define the isometries

\[ V_x^{A\to FBE}\coloneqq\sqrt{1-\varepsilon^{2}}\ket{0}_F\otimes\tilde{V}_0^{A\to BE}+\varepsilon\ket{1}_F\otimes\tilde{V}_x^{A\to BE} \tag{27} \]

and let $\Phi_x(\,\cdot\,)\coloneqq\mathrm{Tr}_E\left[V_x\,\cdot\,V_x^{\dagger}\right]$ be the corresponding quantum channel. It has Kraus rank at most $|E|=r$. The quantum channel $\Phi_x$ has input system $A$ of dimension $d_A$ and output system $FB$ of dimension $2d_B$. We have that

\[ \|V_x-V_y\|_{\infty}=\|\varepsilon\ket{1}\otimes\tilde{V}_x-\varepsilon\ket{1}\otimes\tilde{V}_y\|_{\infty}=\varepsilon\|\tilde{V}_x-\tilde{V}_y\|_{\infty}\leq 2\varepsilon. \tag{28} \]
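The flagged construction (27) and the norm identity (28) are easy to reproduce numerically. In the sketch below, generic random isometries stand in for $\tilde{V}_0,\tilde{V}_x,\tilde{V}_y$ (they are not the specific isometries of Lemma 2), the flag system $F$ is taken as the leading tensor factor, and all names and dimensions are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

def random_isometry(d_out, d_in, rng):
    """Isometry from the reduced QR decomposition of a complex Gaussian matrix."""
    G = rng.normal(size=(d_out, d_in)) + 1j * rng.normal(size=(d_out, d_in))
    Q, _ = np.linalg.qr(G)
    return Q

def flagged_isometry(V0, Vx, eps):
    """sqrt(1 - eps^2) |0>_F (x) V0 + eps |1>_F (x) Vx, with F as the leading index."""
    return np.concatenate([np.sqrt(1 - eps**2) * V0, eps * Vx], axis=0)

d_A, d_BE, eps = 3, 8, 1e-2
V0_t = random_isometry(d_BE, d_A, rng)
Vx_t = random_isometry(d_BE, d_A, rng)
Vy_t = random_isometry(d_BE, d_A, rng)

Vx = flagged_isometry(V0_t, Vx_t, eps)
Vy = flagged_isometry(V0_t, Vy_t, eps)

dist = np.linalg.norm(Vx - Vy, 2)        # operator norm of V_x - V_y
base = np.linalg.norm(Vx_t - Vy_t, 2)    # operator norm of the unflagged difference
print(np.isclose(dist, eps * base), dist <= 2 * eps)
```

The shared $\ket{0}_F$ branch cancels in the difference, which is why the distance between the flagged isometries is exactly $\varepsilon$ times the distance between the original ones.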

On the other hand, lower bounding the diamond norm by choosing the input state to be the maximally entangled state $\Psi_{A'A}$, we have

\[
\begin{aligned}
\left\|\Phi_x-\Phi_y\right\|_{\diamond}&\geq\left\|\Phi_x(\Psi)-\Phi_y(\Psi)\right\|_{1}\\
&=\left\|\mathrm{Tr}_E\left[V_x\Psi V_x^{\dagger}\right]-\mathrm{Tr}_E\left[V_y\Psi V_y^{\dagger}\right]\right\|_{1}\\
&=\bigg\|\varepsilon^{2}\ket{1}\!\bra{1}\otimes\left(\mathrm{Tr}_E\left[\tilde{V}_x\Psi\tilde{V}_x^{\dagger}\right]-\mathrm{Tr}_E\left[\tilde{V}_y\Psi\tilde{V}_y^{\dagger}\right]\right)\\
&\qquad+\varepsilon\sqrt{1-\varepsilon^{2}}\ket{0}\!\bra{1}\otimes\mathrm{Tr}_E\left[\tilde{V}_0\Psi(\tilde{V}_x^{\dagger}-\tilde{V}_y^{\dagger})\right]\\
&\qquad+\varepsilon\sqrt{1-\varepsilon^{2}}\ket{1}\!\bra{0}\otimes\mathrm{Tr}_E\left[(\tilde{V}_x-\tilde{V}_y)\Psi\tilde{V}_0^{\dagger}\right]\bigg\|_{1}\\
&\overset{\text{(i)}}{\geq}\varepsilon\sqrt{1-\varepsilon^{2}}\left\|\ket{0}\!\bra{1}\otimes\mathrm{Tr}_E\left[\tilde{V}_0\Psi(\tilde{V}_x^{\dagger}-\tilde{V}_y^{\dagger})\right]+\ket{1}\!\bra{0}\otimes\mathrm{Tr}_E\left[(\tilde{V}_x-\tilde{V}_y)\Psi\tilde{V}_0^{\dagger}\right]\right\|_{1}\\
&\quad-\varepsilon^{2}\left\|\ket{1}\!\bra{1}\otimes(\tilde{V}_x\Psi\tilde{V}_x^{\dagger}-\tilde{V}_y\Psi\tilde{V}_y^{\dagger})\right\|_{1}\\
&\overset{\text{(ii)}}{=}2\varepsilon\sqrt{1-\varepsilon^{2}}\left\|\mathrm{Tr}_E\left[\tilde{V}_0\Psi(\tilde{V}_x^{\dagger}-\tilde{V}_y^{\dagger})\right]\right\|_{1}-\varepsilon^{2}\left\|\tilde{V}_x\Psi\tilde{V}_x^{\dagger}-\tilde{V}_y\Psi\tilde{V}_y^{\dagger}\right\|_{1}\\
&\overset{\text{(iii)}}{\geq}0.1\,\varepsilon\sqrt{1-\varepsilon^{2}}-2\varepsilon^{2}\\
&\overset{\text{(iv)}}{\geq}0.07\,\varepsilon,
\end{aligned}
\tag{29}
\]

where in (i) we have used the reverse triangle inequality and the bound

\[ \big\|\mathrm{Tr}_E\left[X_{FBE}\right]\big\|_{1}\leq\big\|X_{FBE}\big\|_{1}, \tag{30} \]

which holds for every operator $X_{FBE}$ and follows from the data-processing inequality for the trace norm; in (ii) we have noticed that

\[
\begin{aligned}
&\big\|\ket{0}\!\bra{1}\otimes X_{BE}+\ket{1}\!\bra{0}\otimes X_{BE}^{\dagger}\big\|_{1}\\
&\quad=\operatorname{Tr}\sqrt{\left(\ket{0}\!\bra{1}\otimes X_{BE}+\ket{1}\!\bra{0}\otimes X_{BE}^{\dagger}\right)^{\dagger}\left(\ket{0}\!\bra{1}\otimes X_{BE}+\ket{1}\!\bra{0}\otimes X_{BE}^{\dagger}\right)}\\
&\quad=\operatorname{Tr}\sqrt{\ket{1}\!\bra{1}\otimes X_{BE}^{\dagger}X_{BE}+\ket{0}\!\bra{0}\otimes X_{BE}X_{BE}^{\dagger}}\\
&\quad=\operatorname{Tr}\sqrt{X_{BE}^{\dagger}X_{BE}}+\operatorname{Tr}\sqrt{X_{BE}X_{BE}^{\dagger}}\\
&\quad=2\|X_{BE}\|_{1};
\end{aligned}
\tag{31}
\]

in (iii) we have used Lemma 2 for the first term and upper bounded $\|\tilde{V}_x\Psi\tilde{V}_x^{\dagger}-\tilde{V}_y\Psi\tilde{V}_y^{\dagger}\|_{1}\leq\|\tilde{V}_x\Psi\tilde{V}_x^{\dagger}\|_{1}+\|\tilde{V}_y\Psi\tilde{V}_y^{\dagger}\|_{1}=2$ for the second; finally, in (iv) we used $\varepsilon\leq 0.01$.
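The identity (31) says that the off-diagonal block operator has the singular values of $X$, each appearing twice. A quick numerical check on a random matrix (the dimension and random seed are arbitrary choices made here) is:

```python
import numpy as np

rng = np.random.default_rng(2)

def trace_norm(A):
    """Sum of singular values."""
    return np.linalg.svd(A, compute_uv=False).sum()

d = 4
X = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))

ket0_bra1 = np.array([[0, 1], [0, 0]], dtype=complex)
ket1_bra0 = ket0_bra1.T

# |0><1| (x) X + |1><0| (x) X^dagger, i.e. the block matrix [[0, X], [X^dagger, 0]].
Y = np.kron(ket0_bra1, X) + np.kron(ket1_bra0, X.conj().T)

lhs = trace_norm(Y)
rhs = 2 * trace_norm(X)
print(np.isclose(lhs, rhs))
```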

To sum up, we showed the existence of $\{\Phi_x\}_{x\in[M]}\in\mathcal{E}(d_A,2d_B,r,0.035\varepsilon,2\varepsilon)$ with cardinality $M$ satisfying $\log M=\frac{1}{1201}d_Ad_Br$. This implies the existence of $\{\Phi_x\}_{x\in[M]}\in\mathcal{E}(d_A,d_B,r,\varepsilon,60\varepsilon)$ with cardinality $M$ satisfying $\log M=\frac{1}{2\cdot 1201}d_Ad_Br$ for $\varepsilon\leq 10^{-4}$. By Theorem 1, we conclude:

\[ N\geq\frac{(2/3)\log(M)-\log 2}{4\eta\log(d_Br/\eta)}\geq\frac{d_Ad_Br-2500}{10^{6}\varepsilon\cdot\log(d_Br/(60\varepsilon))}, \tag{32} \]

which completes the proof. ∎
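The rescaling step at the end of the proof above can be spelled out as follows. This is a reading of the argument rather than a quotation from it: the primed symbols $\varepsilon'$ and $d_B'$ are introduced here for bookkeeping, and the substitution implicitly takes the target output dimension $d_B'$ to be even.

```latex
\[
\varepsilon' \coloneqq 0.035\,\varepsilon,\qquad d_B' \coloneqq 2 d_B
\quad\Longrightarrow\quad
\begin{cases}
\|\Phi_x-\Phi_y\|_{\diamond} \geq 0.07\,\varepsilon = 2\varepsilon', \\[2pt]
\|V_x-V_y\|_{\infty} \leq 2\varepsilon = \dfrac{2}{0.035}\,\varepsilon' \approx 57.1\,\varepsilon' \leq 60\,\varepsilon', \\[6pt]
\log M = \dfrac{d_A d_B r}{1201} = \dfrac{d_A d_B' r}{2\cdot 1201}.
\end{cases}
\]
```

Under this substitution, the requirement $\varepsilon'\leq 10^{-4}$ gives $\varepsilon=\varepsilon'/0.035\leq 2.9\times 10^{-3}$, which is within the range $\varepsilon\leq 0.01$ used in step (iv) of (29).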

4 Proof of Lemma 2

In order to provide the set of isometries $\{\tilde{V}_x^{A\to BE}\}_{x\in[M]}$ claimed in the statement of Lemma 2, we are going to use the following construction. Let us introduce

\[ S^{A\to BE}\coloneqq\sum_{i=1}^{d_A}\ket{i}_{BE}\bra{i}_{A}\in\mathbb{C}^{rd_B\times d_A}\qquad\text{and}\qquad\tilde{V}_x^{A\to BE}\coloneqq U_x^{BE}S^{A\to BE}, \tag{33} \]

where $U_x\in\mathrm{U}(rd_B)$ is a unitary operator. In our construction, we will consider independent random unitaries $U_x$ sampled according to the Haar measure. We are going to leverage the following technical statements in the proof of Lemma 2.

Lemma 4.

Let $U_x,U_y\in{\rm U}(rd_B)$ and let $\widetilde{V}_x,\widetilde{V}_y$ be defined as in (33). Then, let us define the operator

\[ C\coloneqq\mathrm{Tr}_E\left[\widetilde{V}_0\Psi_{A'A}(\widetilde{V}_x^{\dagger}-\widetilde{V}_y^{\dagger})\right], \tag{34} \]

where $\Psi_{A'A}$ is the maximally entangled state between the systems $A$ and $A'$. The function $f(U_x,U_y)\coloneqq\|C\|_1$ is $\sqrt{\frac{2}{d_A}}$-Lipschitz with respect to the $\ell_2$-sum of the 2-norms, namely

\[ |f(U_x,U_y)-f(U'_x,U'_y)|\leq\sqrt{\frac{2}{d_A}}\,\|(U_x,U_y)-(U'_x,U'_y)\|_2 \tag{35} \]

for all $U_x,U'_x,U_y,U'_y\in{\rm U}(rd_B)$, where $\|(A,B)\|_2\coloneqq\sqrt{\|A\|_2^2+\|B\|_2^2}$ is the $\ell_2$-sum of the 2-norms. Furthermore, if we consider independent random unitaries $U_x,U_y\sim{\rm Haar}({\rm U}(rd_B))$, we have

  (a) $\mathbb{E}\operatorname{Tr}\big[|C|^2\big]=\frac{2}{r}$,

  (b) $\mathbb{E}\operatorname{Tr}\big[|C|^4\big]\leq\frac{128}{r^3}$.

Proof.

See Appendix C. ∎
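As a numerical sanity check of Lemma 4, the sketch below samples Haar-random unitaries, builds $C$ from (34), and estimates $\mathbb{E}\operatorname{Tr}[|C|^2]$ while also testing the single-pair bound $\|C\|_1\leq d_A^{-1/2}\|U_x-U_y\|_2$ that underlies the Lipschitz estimate. It is an illustration under stated assumptions, not part of the proof: the helper names are hypothetical, the $BE$ index is assumed to be ordered as $b\cdot r+e$, $S$ is taken to embed $A$ into the first $d_A$ basis vectors of $BE$, and $\widetilde{V}_0$ is an arbitrary fixed isometry (which is enough for item (a) and for the Lipschitz bound, but not for item (b)).

```python
import numpy as np


def haar_unitary(d, rng):
    """Sample a d x d Haar-random unitary via QR of a complex Ginibre matrix."""
    z = (rng.standard_normal((d, d)) + 1j * rng.standard_normal((d, d))) / np.sqrt(2)
    q, upper = np.linalg.qr(z)
    phases = np.diag(upper) / np.abs(np.diag(upper))
    return q * phases  # fix column phases so that the law is Haar


def choi_vector(u, d_a, d_b, r):
    """Tensor W[a, b, e] of (1 x V)|Psi>, with V = U S the first d_a columns of U."""
    v = u[:, :d_a].reshape(d_b, r, d_a)  # assumed ordering: BE index = b * r + e
    return np.transpose(v, (2, 0, 1)) / np.sqrt(d_a)


def c_operator(w0, wx, wy):
    """C = Tr_E[V_0 Psi (V_x^dagger - V_y^dagger)] as a (d_a d_b) x (d_a d_b) matrix."""
    d_a, d_b, _ = w0.shape
    c = np.einsum("abe,cde->abcd", w0, np.conj(wx - wy))
    return c.reshape(d_a * d_b, d_a * d_b)


def simulate(d_a=2, d_b=2, r=2, samples=4000, seed=0):
    """Return (empirical mean of Tr|C|^2, smallest slack in the trace-norm bound)."""
    rng = np.random.default_rng(seed)
    d = d_b * r
    w0 = choi_vector(haar_unitary(d, rng), d_a, d_b, r)
    second_moments, slacks = [], []
    for _ in range(samples):
        ux, uy = haar_unitary(d, rng), haar_unitary(d, rng)
        c = c_operator(w0, choi_vector(ux, d_a, d_b, r), choi_vector(uy, d_a, d_b, r))
        second_moments.append(np.linalg.norm(c, "fro") ** 2)  # Tr|C|^2
        trace_norm = np.linalg.norm(c, "nuc")                 # ||C||_1
        slacks.append(np.linalg.norm(ux - uy, "fro") / np.sqrt(d_a) - trace_norm)
    return float(np.mean(second_moments)), float(np.min(slacks))


if __name__ == "__main__":
    mean_sq, min_slack = simulate()
    print("empirical E Tr|C|^2:", mean_sq, "target 2/r:", 1.0)
    print("smallest slack in the trace-norm bound:", min_slack)
```

The empirical mean should approach $2/r$ as the number of samples grows, and the slack should never be negative beyond floating-point error.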

Lemma 5 ([meckes2013spectral, Corollary 17]).

Let $k,d\geq 1$. Suppose that $f:\big({\rm U}(d)\big)^k\to\mathbb{R}$ is $L$-Lipschitz with respect to the $\ell_2$-sum of the 2-norms, i.e.

\[ \big|f(U_1,\dots,U_k)-f(U'_1,\dots,U'_k)\big|\leq L\sqrt{\sum_{i=1}^{k}\|U_i-U'_i\|_2^2} \tag{36} \]

for all $U_i,U'_i\in{\rm U}(d)$, with $i=1,\dots,k$. Then, if we independently sample $U_1,\dots,U_k$ according to the Haar measure on ${\rm U}(d)$, the following inequality holds for each $t>0$:

\[ \mathbb{P}\left(f(U_1,\dots,U_k)\geq\mathbb{E}\left[f(U_1,\dots,U_k)\right]+t\right)\leq\exp\left(-\frac{dt^2}{12L^2}\right). \tag{37} \]

Now we have all the ingredients to prove Lemma 2.

Proof of Lemma 2..

Let $C$ be as in Lemma 4. By Hölder’s inequality applied to $\mathbb{E}\operatorname{Tr}[\,\cdot\,]$ with conjugate exponents $3$ and $3/2$, we get

\[ \mathbb{E}\big[\operatorname{Tr}\big[|C|^2\big]\big]=\mathbb{E}\big[\operatorname{Tr}\big[|C|^{4/3}|C|^{2/3}\big]\big]\leq\left(\mathbb{E}\left[\operatorname{Tr}\big[|C|^4\big]\right]\right)^{1/3}\left(\mathbb{E}\left[\operatorname{Tr}\big[|C|\big]\right]\right)^{2/3}, \tag{38} \]

which yields

\[ \big(\mathbb{E}\big[\operatorname{Tr}\big[|C|^2\big]\big]\big)^3\leq\mathbb{E}\left[\operatorname{Tr}\big[|C|^4\big]\right]\left(\mathbb{E}\left[\operatorname{Tr}\big[|C|\big]\right]\right)^2. \tag{39} \]

Whence, by the bounds (a) and (b) of Lemma 4 combined with (39), we have

\[ \big(\mathbb{E}\operatorname{Tr}\big[|C|\big]\big)^2\geq\frac{\big(\mathbb{E}\operatorname{Tr}\big[|C|^2\big]\big)^3}{\mathbb{E}\operatorname{Tr}\big[|C|^4\big]}\geq\frac{(\frac{2}{r})^3}{\frac{128}{r^3}}=\frac{1}{16}. \tag{40} \]

Furthermore, since the function $f(U_x,U_y)\coloneqq\|C\|_1$ is $\sqrt{\frac{2}{d_A}}$-Lipschitz, when we sample two independent unitaries $U_x,U_y\sim{\rm Haar}({\rm U}(rd_B))$, Lemma 5 with $t=\frac{1}{5}$ (applied, to be precise, to $-f$, which is also $\sqrt{\frac{2}{d_A}}$-Lipschitz) gives

\[ \mathbb{P}\left(\mathbb{E}\left[f(U_x,U_y)\right]-f(U_x,U_y)\geq\frac{1}{5}\right)\leq\exp\left(-\frac{rd_B}{300\cdot\frac{2}{d_A}}\right)=\exp\left(-\frac{d_Ad_Br}{600}\right)\eqcolon\delta. \tag{41} \]

Therefore, with probability at least $1-\delta$, we have

\[ f(U_x,U_y)>\mathbb{E}\left[f(U_x,U_y)\right]-\frac{1}{5}\stackrel{\text{(i)}}{\geq}\sqrt{\frac{1}{16}}-\frac{1}{5}=\frac{1}{20}, \tag{42} \]

where in (i) we have used the lower bound (40). Let

\[ M\coloneqq\left\lfloor\exp\left(\frac{d_Ad_Br}{1201}\right)\right\rfloor \tag{43} \]

and let $\{U_x\}_{x\in[M]}$ be i.i.d. Haar-random unitaries. Note that, by their very definitions, $M^2\delta<1$ and $\log M=\Omega(d_Ad_Br)$. By the union bound, we have

\[ \mathbb{P}\left(\exists\, x\neq y:\ f(U_x,U_y)<\frac{1}{20}\right)\leq M(M-1)\;\mathbb{P}\left(f(U_x,U_y)<\frac{1}{20}\right)\leq M^2\delta<1. \tag{44} \]

Hence, there exists a family $\{U_x\}_{x\in[M]}$ such that, for all $x\neq y$,

\[ f(U_x,U_y)=\left\|\mathrm{Tr}_E\left[\widetilde{V}_0\ket{\Psi}\!\!\bra{\Psi}_{A'A}(\widetilde{V}_x^{\dagger}-\widetilde{V}_y^{\dagger})\right]\right\|_1\geq\frac{1}{20}. \tag{45} \]

This concludes the proof of Lemma 2. ∎
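The numerical constants in the proof above can be re-derived with exact rational arithmetic. The sketch below (function names are hypothetical) recomputes the ratio in (40), the separation $\frac{1}{20}$ in (42), and the sign of the exponent of $M^2\delta$ that makes the union bound (44) work, using $M\leq\exp(d_Ad_Br/1201)$.

```python
from fractions import Fraction


def first_moment_lower_bound_squared(r: int) -> Fraction:
    """(2/r)^3 divided by 128/r^3, the ratio appearing in (40)."""
    return Fraction(2, r) ** 3 / Fraction(128, r ** 3)


def separation(t: Fraction = Fraction(1, 5)) -> Fraction:
    """sqrt(1/16) - t, the value reached in (42)."""
    return Fraction(1, 4) - t


def union_bound_exponent(t: Fraction = Fraction(1, 5)) -> Fraction:
    """Coefficient c such that log(M^2 * delta) <= c * d_A * d_B * r."""
    # Lemma 5 exponent d t^2 / (12 L^2) with d = r d_B and L^2 = 2 / d_A,
    # written as a multiple of d_A d_B r.
    delta_coeff = t ** 2 / (12 * 2)
    return 2 * Fraction(1, 1201) - delta_coeff


if __name__ == "__main__":
    print(first_moment_lower_bound_squared(7))  # independent of r
    print(separation())
    print(union_bound_exponent())               # negative, hence M^2 * delta < 1
```

The last coefficient is negative only by a small margin, which is why the denominator 1201 (rather than 1200) appears in (43).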

5 Conclusion

We have proved that learning an unknown quantum channel to diamond distance $\varepsilon$ requires $\Omega\bigl(d_Ad_Br/(\varepsilon\log(d_Br/\varepsilon))\bigr)$ queries, improving upon the previous $\Omega(d_Ad_Br)$ bound. The proof constructs ensembles of channels that are well-separated in diamond norm yet admit Stinespring isometries that are close in operator norm.

Several natural questions remain open. First, the precise $\varepsilon$-dependence in the general case is still unclear: while our bound scales as $\widetilde{\Omega}(1/\varepsilon)$, the state-learning regime suggests $\Theta(1/\varepsilon^2)$ is necessary in some parameter ranges. Second, it is unknown whether coherent strategies offer any advantage over parallel strategies for channel learning in diamond distance. Finally, the role of quantum memory in the query complexity requires further exploration.

Acknowledgments

FG acknowledges financial support from the European Union (ERC StG ETQO, Grant Agreement no. 101165230).

References

Appendix A A weaker lower bound using existing packing nets

A natural approach to constructing such an ensemble in $\mathcal{E}(d_A,d_B,r,\varepsilon,\eta)$ is to use packing nets. Assume that $d_B\geq 2$. From [Girardi2025Dec, Lemma 14], we have

\[ \log\mathcal{M}\left(\mathcal{C}(d_A,d_B,\lfloor\tfrac{r}{2}\rfloor),\ \|\cdot\|_{\diamond},\ 1/2\right)=\Theta\left(r\,d_Ad_B\right), \tag{46} \]

where $\mathcal{M}(\mathcal{S},|\cdot|,\delta)$ denotes the $\delta$-packing number of the set $\mathcal{S}$ with respect to the norm $|\cdot|$.

Let $M=\mathcal{M}\big(\mathcal{C}(d_A,d_B,\lfloor\frac{r}{2}\rfloor),\ \|\cdot\|_{\diamond},\ 1/2\big)$, and let $\{\widetilde{\Phi}_x\}_{x\in[M]}$ be a $1/2$-diamond-norm packing of quantum channels, with corresponding Stinespring isometries $\{\widetilde{V}_x\}_{x\in[M]}$.

For a given $\varepsilon\in(0,\frac{1}{4})$ and each $x\in[M]$, we define the convex mixture

\[ \Phi_x=(1-4\varepsilon)\widetilde{\Phi}_1+4\varepsilon\widetilde{\Phi}_x. \tag{47} \]

This is a valid quantum channel of Choi rank at most $2\lfloor\frac{r}{2}\rfloor\leq r$. We observe that for any distinct $x,y\in[M]$,

\[ \|\Phi_x-\Phi_y\|_{\diamond}=4\varepsilon\|\widetilde{\Phi}_x-\widetilde{\Phi}_y\|_{\diamond}>2\varepsilon, \tag{48} \]

since $\|\widetilde{\Phi}_x-\widetilde{\Phi}_y\|_{\diamond}>1/2$ by the packing property. Moreover, by [kretschmann2008information], we have

\[ \inf_{V_{\Phi_x}}\|V_{\Phi_x}-V_1\|_{\mathrm{op}}^2\leq\|\Phi_x-\Phi_1\|_{\diamond}=4\varepsilon\|\widetilde{\Phi}_x-\widetilde{\Phi}_1\|_{\diamond}\leq 4\varepsilon, \tag{49} \]

where $V_1$ is a Stinespring isometry for $\Phi_1$. Let $V_x$ be a Stinespring isometry for $\Phi_x$ achieving this infimum. Then for all $x,y\in[M]$,

\[ \|V_x-V_y\|_{\mathrm{op}}\leq\|V_x-V_1\|_{\mathrm{op}}+\|V_y-V_1\|_{\mathrm{op}}\leq 4\sqrt{\varepsilon}. \tag{50} \]

Thus, $\{\Phi_x\}_{x\in[M]}\in\mathcal{E}(d_A,d_B,r,\varepsilon,4\sqrt{\varepsilon})$, which, by Theorem 1, implies the lower bound

\[ N\geq\frac{(2/3)\log(M)-\log 2}{4\eta\log(d_Br/\eta)}\geq\Omega\left(\frac{d_Ad_Br}{\sqrt{\varepsilon}\log(d_Br/\sqrt{\varepsilon})}\right). \tag{51} \]

Appendix B Existence of the quantum channel $\widetilde{\Phi}_0$

In this section, we want to show the existence of a quantum channel $\widetilde{\Phi}_0$ with Kraus operators $\{K_{0,i}\}_{i\in[r]}$ satisfying

\[ \left|\operatorname{Tr}\left[K_{0,i}^{\dagger}K_{0,j}\right]\right|\leq\frac{2d_A}{r}\delta_{i,j}\,,\quad\forall i,j\in[r]. \tag{52} \]

We distinguish two cases, depending on whether $d_A\leq d_B$ or not.

  • Case 1: $d_A\leq d_B$. Let $k=\bigl\lfloor\frac{d_B}{d_A}\bigr\rfloor\in[1,r]$ and $l=\bigl\lceil\frac{r}{k}\bigr\rceil$. Note that $l\leq\bigl\lceil\frac{rd_A}{d_B}\bigr\rceil\leq d_A^2$. We decompose $\mathbb{C}^{d_B}\simeq\bigl(\bigoplus_{i=1}^{k}\mathbb{C}^{d_A}\bigr)\oplus\mathbb{C}^{d_C}$, where $d_C=d_B-kd_A<d_A$.

    For each block $A_i\simeq A$ ($i=1,\dots,k$), we can choose $l\leq d_A^2$ orthogonal $d_A\times d_A$ unitary matrices $\{U_{i,j}\}_{j\in[l]}$ (for example, a subset of the generalized Pauli operators). Since $kl=k\bigl\lceil\frac{r}{k}\bigr\rceil\geq r$, we may select a subset $S\subset[k]\times[l]$ with $|S|=r$. For each $(i,j)\in S$, define the Kraus operator

    \[ K_{i,j}=\left(0\oplus\tfrac{1}{\sqrt{r}}\,U_{i,j}\right), \tag{53} \]

    where the direct sum is taken with respect to the decomposition above, and $U_{i,j}$ acts nontrivially only on the $i$-th $\mathbb{C}^{d_A}$ summand.

    We then verify:

    • (a) Completeness:

      \[ \sum_{(i,j)\in S}K_{i,j}^{\dagger}K_{i,j}=\sum_{(i,j)\in S}\frac{1}{r}\,U_{i,j}^{\dagger}U_{i,j}=\mathbb{I}_A. \tag{54} \]

    • (b) Orthogonality: For all $(i,j),(i',j')\in S$,

      \[ \operatorname{Tr}\!\big[K_{i,j}^{\dagger}K_{i',j'}\big]=\frac{d_A}{r}\,\delta_{i,i'}\delta_{j,j'}. \tag{55} \]

    • (c) Kraus rank: The number of Kraus operators is exactly $r=|S|$.

  • Case 2: $d_A>d_B$. Let $k=\bigl\lfloor\frac{d_A}{d_B}\bigr\rfloor\in[1,r]$ and write $d_A=kd_B+d_C$ with $0\leq d_C<d_B$. We can then decompose $\mathds{1}_A=\mathds{1}_{B_1}\oplus\cdots\oplus\mathds{1}_{B_k}\oplus\mathds{1}_C$, where each $B_i\simeq B$ (i.e., $\dim B_i=d_B$).

    For each block $B_i$ ($i=1,\dots,k$), construct $l=\bigl\lfloor\frac{r}{k}\bigr\rfloor\in[1,d_B^2]$ orthogonal $d_B\times d_B$ unitary matrices $\{U_{i,j}\}_{j\in[l]}$ that are supported on $B_i$ and define the corresponding $d_A\times d_B$ matrices

    \[ K_{i,j}=\bigl(0\oplus\tfrac{1}{\sqrt{l}}\,U_{i,j}\bigr)\quad(j\in[l]), \tag{56} \]

    where the direct sum is taken with respect to the decomposition $\mathbb{C}^{d_A}\simeq\bigl(\bigoplus_{i=1}^{k}\mathbb{C}^{d_B}\bigr)\oplus\mathbb{C}^{d_C}$ and $U_{i,j}$ acts nontrivially only on the $i$-th $d_B$-dimensional summand.

    For the remaining block $C$, since $d_C<d_B$ we can apply Case 1 and construct $r'=\bigl\lceil\frac{rd_C}{d_A}\bigr\rceil\in[1,d_Cd_B]$ orthogonal $d_C\times d_B$ isometries $\{V_{i'}\}_{i'\in[r']}$ and define

    \[ K_{k+1,i'}=\bigl(0\oplus\tfrac{1}{\sqrt{r'}}\,V_{i'}\bigr)\quad(i'\in[r']), \tag{57} \]

    where now $V_{i'}$ acts nontrivially only on the $\mathbb{C}^{d_C}$ summand. We can check:

    • (a) Completeness:

      \[ \sum_{i=1}^{k}\sum_{j=1}^{l}K_{i,j}^{\dagger}K_{i,j}+\sum_{i'=1}^{r'}K_{k+1,i'}^{\dagger}K_{k+1,i'}=\sum_{i=1}^{k}\sum_{j=1}^{l}\frac{1}{l}\,\mathbb{I}_{B_i}+\sum_{i'=1}^{r'}\frac{1}{r'}\,\mathbb{I}_C=\mathbb{I}_A. \tag{58} \]

    • (b) Orthogonality: For all $i,i'\in[k]$ and $j,j'\in[l]$,

      \[ \operatorname{Tr}\!\big[K_{i,j}^{\dagger}K_{i',j'}\big]=\delta_{i,i'}\delta_{j,j'}\frac{d_B}{l}\leq\delta_{i,i'}\delta_{j,j'}\frac{d_A}{r}, \tag{59} \]

      and for $i',i''\in[r']$,

      \[ \operatorname{Tr}\!\big[K_{k+1,i'}^{\dagger}K_{k+1,i''}\big]=\delta_{i',i''}\frac{d_C}{r'}\leq\delta_{i',i''}\frac{d_A}{r}. \tag{60} \]

    • (c) Kraus rank: The total number of Kraus operators is

      \[ lk+r'=\Bigl\lfloor\frac{r}{k}\Bigr\rfloor k+\Bigl\lceil\frac{rd_C}{d_A}\Bigr\rceil\leq 2r. \tag{61} \]
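The Case 1 construction above can be instantiated numerically. The sketch below is one possible realisation, under the assumption that the orthogonal unitaries are taken to be the first $l$ Weyl (generalized Pauli) operators $X^aZ^b$ and that $S$ consists of the first $r$ pairs of $[k]\times[l]$ in lexicographic order; all function names are hypothetical. It then checks completeness (54) and orthogonality (55).

```python
import numpy as np


def weyl_operators(d):
    """All d^2 Weyl operators X^a Z^b; they are pairwise Hilbert-Schmidt orthogonal."""
    omega = np.exp(2j * np.pi / d)
    shift = np.roll(np.eye(d), 1, axis=0)    # X|j> = |j+1 mod d>
    clock = np.diag(omega ** np.arange(d))   # Z|j> = omega^j |j>
    return [np.linalg.matrix_power(shift, a) @ np.linalg.matrix_power(clock, b)
            for a in range(d) for b in range(d)]


def case1_kraus(d_a, d_b, r):
    """Kraus operators of Case 1 (d_A <= d_B), stored as d_B x d_A matrices."""
    assert d_a <= d_b
    k = d_b // d_a
    l = -(-r // k)  # ceil(r / k)
    assert l <= d_a ** 2, "at most d_A^2 orthogonal unitaries are available per block"
    unitaries = weyl_operators(d_a)
    pairs = [(i, j) for i in range(k) for j in range(l)][:r]  # a subset S with |S| = r
    kraus = []
    for i, j in pairs:
        op = np.zeros((d_b, d_a), dtype=complex)
        op[i * d_a:(i + 1) * d_a, :] = unitaries[j] / np.sqrt(r)  # output in block i
        kraus.append(op)
    return kraus


def completeness(kraus):
    """Sum of K^dagger K, which should equal the identity on A."""
    return sum(op.conj().T @ op for op in kraus)


def gram(kraus):
    """Matrix of Hilbert-Schmidt inner products Tr[K_m^dagger K_n]."""
    return np.array([[np.trace(a.conj().T @ b) for b in kraus] for a in kraus])


if __name__ == "__main__":
    ops = case1_kraus(d_a=2, d_b=5, r=6)
    print(np.allclose(completeness(ops), np.eye(2)))
    print(np.allclose(gram(ops), (2 / 6) * np.eye(6)))
```

Operators attached to different blocks are orthogonal because their ranges are orthogonal, while operators within a block are orthogonal because the Weyl operators are.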

Appendix C Proof of Lemma 4

Let us start by upper bounding

\[ \begin{aligned} \|C\|_1 &\stackrel{\text{(i)}}{\leq}\|\widetilde{V}_0\Psi_{A'A}(\widetilde{V}_x^{\dagger}-\widetilde{V}_y^{\dagger})\|_1 \\ &\stackrel{\text{(ii)}}{=}\|\Psi_{A'A}(\widetilde{V}_x^{\dagger}-\widetilde{V}_y^{\dagger})\|_1 \\ &=\sqrt{\bra{\Psi_{A'A}}(\widetilde{V}_x^{\dagger}-\widetilde{V}_y^{\dagger})(\widetilde{V}_x-\widetilde{V}_y)\ket{\Psi_{A'A}}} \\ &\stackrel{\text{(iii)}}{=}\sqrt{\frac{1}{d_A}\operatorname{Tr}|\widetilde{V}_x-\widetilde{V}_y|^2} \\ &=\frac{1}{\sqrt{d_A}}\|\widetilde{V}_x-\widetilde{V}_y\|_2 \\ &\stackrel{\text{(iv)}}{\leq}\frac{1}{\sqrt{d_A}}\|U_x-U_y\|_2, \end{aligned} \tag{62} \]

where in (i) we have used the data-processing inequality in a similar way to (30), in (ii) we have leveraged the variational characterisation of the 1-norm $\|\,\cdot\,\|_1=\max_{-\mathds{1}\leq X\leq\mathds{1}}\operatorname{Tr}[X\,\cdot\,]$ and we have absorbed $\widetilde{V}_0$ in $X$, in (iii) we have recalled the identity

\[ \bra{\Psi_{A'A}}(\widetilde{V}_x^{\dagger}-\widetilde{V}_y^{\dagger})(\widetilde{V}_x-\widetilde{V}_y)\ket{\Psi_{A'A}}=\frac{1}{d_A}\operatorname{Tr}\left[(\widetilde{V}_x^{\dagger}-\widetilde{V}_y^{\dagger})(\widetilde{V}_x-\widetilde{V}_y)\right]; \tag{63} \]

finally, in (iv) we have noticed that

\[ \begin{aligned} \|\widetilde{V}_x-\widetilde{V}_y\|_2^2 &=\operatorname{Tr}\big[S^{\dagger}(U_x-U_y)^{\dagger}(U_x-U_y)S\big] \\ &=\operatorname{Tr}\big[SS^{\dagger}(U_x-U_y)^{\dagger}(U_x-U_y)\big]\leq\|U_x-U_y\|_2^2, \end{aligned} \tag{64} \]

as $SS^{\dagger}\leq\mathds{1}_{BE}$.

Calling $f(U_x,U_y)\coloneqq\|C\|_1$, we have

\[ \begin{aligned} &|f(U_x,U_y)-f(U'_x,U'_y)| \\ &\quad=\left|\;\left\|\mathrm{Tr}_E\left[\widetilde{V}_0\ket{\Psi}\!\!\bra{\Psi}_{A'A}(\widetilde{V}_x^{\dagger}-\widetilde{V}_y^{\dagger})\right]\right\|_1-\left\|\mathrm{Tr}_E\left[\widetilde{V}_0\ket{\Psi}\!\!\bra{\Psi}_{A'A}(\widetilde{V}_x^{\prime\dagger}-\widetilde{V}_y^{\prime\dagger})\right]\right\|_1\;\right| \\ &\quad\stackrel{\text{(v)}}{\leq}\left\|\mathrm{Tr}_E\left[\widetilde{V}_0\ket{\Psi}\!\!\bra{\Psi}_{A'A}(\widetilde{V}_x^{\dagger}-\widetilde{V}_y^{\dagger})\right]-\mathrm{Tr}_E\left[\widetilde{V}_0\ket{\Psi}\!\!\bra{\Psi}_{A'A}(\widetilde{V}_x^{\prime\dagger}-\widetilde{V}_y^{\prime\dagger})\right]\right\|_1 \\ &\quad\stackrel{\text{(vi)}}{\leq}\left\|\mathrm{Tr}_E\left[\widetilde{V}_0\ket{\Psi}\!\!\bra{\Psi}_{A'A}(\widetilde{V}_x^{\dagger}-\widetilde{V}_x^{\prime\dagger})\right]\right\|_1+\left\|\mathrm{Tr}_E\left[\widetilde{V}_0\ket{\Psi}\!\!\bra{\Psi}_{A'A}(\widetilde{V}_y^{\prime\dagger}-\widetilde{V}_y^{\dagger})\right]\right\|_1 \\ &\quad\stackrel{\text{(vii)}}{\leq}\sqrt{\frac{1}{d_A}}\|U_x-U'_x\|_2+\sqrt{\frac{1}{d_A}}\|U_y-U'_y\|_2 \\ &\quad\stackrel{\text{(viii)}}{\leq}\sqrt{\frac{2}{d_A}}\sqrt{\|U_x-U'_x\|_2^2+\|U_y-U'_y\|_2^2} \\ &\quad=\sqrt{\frac{2}{d_A}}\|(U_x,U_y)-(U'_x,U'_y)\|_2, \end{aligned} \tag{65} \]

where in (v) we have used the reverse triangle inequality, in (vi) we have leveraged the triangle inequality, in (vii) we have bounded as in (62), and in (viii) we have recalled the inequality $|a|+|b|\leq\sqrt{2(a^2+b^2)}$. This completes the first part of the proof of Lemma 4. Now, we want to prove that

\[ \mathbb{E}\operatorname{Tr}\big[|C|^2\big]=\frac{2}{r} \tag{66} \]

when we sample independent random unitaries $U_x,U_y\sim{\rm Haar}({\rm U}(rd_B))$. Let $\{\ket{i}_E\}_{i\in[r]}$ be an orthonormal basis for $E$. For $i=1,\dots,r$, let $K_{0,i}^{A\to B}\coloneqq\bra{i}_E\widetilde{V}_0^{A\to BE}$ and $K_{x,i}^{A\to B}\coloneqq\bra{i}_E\widetilde{V}_x^{A\to BE}$ be the Kraus operators obtained from the isometries $\widetilde{V}_0$ and $\widetilde{V}_x$, respectively. Writing the trace on the system $E$ in terms of the basis $\{\ket{i}_E\}_{i\in[r]}$, we get

\[ \begin{aligned} \mathbb{E}\operatorname{Tr}\big[|C|^2\big] &=\mathbb{E}\operatorname{Tr}\left[\sum_{i,j=1}^{r}K_{0,i}\Psi_{A'A}(K_{x,i}-K_{y,i})^{\dagger}(K_{x,j}-K_{y,j})\Psi_{A'A}K_{0,j}^{\dagger}\right] \\ &\stackrel{\text{(viii)}}{=}\sum_{i,j=1}^{r}\operatorname{Tr}\left[K_{0,i}\Psi_{A'A}\left(\frac{2\mathds{1}}{r}\delta_{i,j}\right)\Psi_{A'A}K_{0,j}^{\dagger}\right] \\ &=\frac{2}{r}\sum_{i=1}^{r}\operatorname{Tr}\left[K_{0,i}\Psi_{A'A}^2K_{0,i}^{\dagger}\right] \\ &=\frac{2}{r}\operatorname{Tr}\left[\Psi_{A'A}\sum_{i=1}^{r}K_{0,i}^{\dagger}K_{0,i}\right] \\ &=\frac{2}{r}, \end{aligned} \tag{67} \]

where in (viii) we have expanded

\[ \begin{aligned} \mathbb{E}\left[(K_{x,i}-K_{y,i})^{\dagger}(K_{x,j}-K_{y,j})\right] &=\mathbb{E}\left[K_{x,i}^{\dagger}K_{x,j}\right]+\mathbb{E}\left[K_{y,i}^{\dagger}K_{y,j}\right] \\ &\quad-\mathbb{E}\left[K_{x,i}^{\dagger}K_{y,j}\right]-\mathbb{E}\left[K_{y,i}^{\dagger}K_{x,j}\right] \end{aligned} \tag{68} \]

and, for $z_1,z_2\in\{x,y\}$, we have computed

\[ \begin{aligned} \mathbb{E}\left[K_{z_1,i}^{\dagger}K_{z_2,j}\right] &=\mathbb{E}\left[\widetilde{V}_{z_1}^{\dagger}\ket{i}_E\bra{j}_E\widetilde{V}_{z_2}\right]=S^{\dagger}\,\mathbb{E}\left[U_{z_1}^{\dagger}\ket{i}_E\bra{j}_EU_{z_2}\right]S \\ &=S^{\dagger}\left(\frac{\delta_{z_1,z_2}}{rd_B}\operatorname{Tr}\left[\ket{i}_E\bra{j}_E\otimes\mathds{1}_B\right]\right)S=\frac{\delta_{z_1,z_2}\delta_{i,j}}{r}\mathds{1}_A, \end{aligned} \tag{69} \]

leveraging the facts that $\mathbb{E}_{U\in{\rm U}(d)}[U]=0$, $\mathbb{E}_{U\in{\rm U}(d)}[U^{\dagger}XU]=\frac{\operatorname{Tr}[X]}{d}\mathds{1}$ and $S^{\dagger}S=\mathds{1}_A$.

The only inequality we are left to prove is

\[ \mathbb{E}\operatorname{Tr}\big[|C|^4\big]\leq\frac{128}{r^3}. \tag{70} \]

We have

\[ \begin{aligned} \mathbb{E}\operatorname{Tr}\big[|C|^4\big] &=\mathbb{E}\operatorname{Tr}\Bigg[\sum_{i,j=1}^{r}K_{0,i}\Psi_{A'A}(K_{x,i}-K_{y,i})^{\dagger}(K_{x,j}-K_{y,j})\Psi_{A'A}K_{0,j}^{\dagger} \\ &\qquad\qquad\times\sum_{k,l=1}^{r}K_{0,k}\Psi_{A'A}(K_{x,k}-K_{y,k})^{\dagger}(K_{x,l}-K_{y,l})\Psi_{A'A}K_{0,l}^{\dagger}\Bigg] \\ &=\mathbb{E}\sum_{i,j=1}^{r}\sum_{k,l=1}^{r}\bra{\Psi_{A'A}}K_{0,l}^{\dagger}K_{0,i}\ket{\Psi_{A'A}}\bra{\Psi_{A'A}}K_{0,j}^{\dagger}K_{0,k}\ket{\Psi_{A'A}} \\ &\qquad\qquad\times\bra{\Psi_{A'A}}(K_{x,i}-K_{y,i})^{\dagger}(K_{x,j}-K_{y,j})\ket{\Psi_{A'A}} \\ &\qquad\qquad\times\bra{\Psi_{A'A}}(K_{x,k}-K_{y,k})^{\dagger}(K_{x,l}-K_{y,l})\ket{\Psi_{A'A}} \\ &\stackrel{\text{(ix)}}{\leq}\mathbb{E}\sum_{i,j=1}^{r}\frac{4}{r^2}\left|\bra{\Psi_{A'A}}(K_{x,i}-K_{y,i})^{\dagger}(K_{x,j}-K_{y,j})\ket{\Psi_{A'A}}\right|^2, \end{aligned} \tag{71} \]

where in (ix) we have noticed that, by (24),

\[ \begin{aligned} &\big|\bra{\Psi_{A'A}}K_{0,l}^{\dagger}K_{0,i}\ket{\Psi_{A'A}}\bra{\Psi_{A'A}}K_{0,j}^{\dagger}K_{0,k}\ket{\Psi_{A'A}}\big| \\ &\qquad=\left|\frac{1}{d_A}\operatorname{Tr}[K_{0,l}^{\dagger}K_{0,i}]\,\frac{1}{d_A}\operatorname{Tr}[K_{0,j}^{\dagger}K_{0,k}]\right|\leq\frac{4}{r^2}\delta_{i,l}\delta_{j,k}. \end{aligned} \tag{72} \]

Hence

\[ \begin{aligned} \mathbb{E}\operatorname{Tr}\big[|C|^4\big] &\leq\mathbb{E}\sum_{i,j=1}^{r}\frac{4}{r^2}\left|\bra{\Psi_{A'A}}(K_{x,i}-K_{y,i})^{\dagger}(K_{x,j}-K_{y,j})\ket{\Psi_{A'A}}\right|^2 \\ &\stackrel{\text{(x)}}{\leq}\mathbb{E}\sum_{i,j=1}^{r}\frac{16}{r^2}\Big(\left|\bra{\Psi_{A'A}}K_{x,i}^{\dagger}K_{x,j}\ket{\Psi_{A'A}}\right|^2+\left|\bra{\Psi_{A'A}}K_{y,i}^{\dagger}K_{y,j}\ket{\Psi_{A'A}}\right|^2 \\ &\qquad\qquad\qquad+2\left|\bra{\Psi_{A'A}}K_{x,i}^{\dagger}K_{y,j}\ket{\Psi_{A'A}}\right|^2\Big) \\ &\stackrel{\text{(xi)}}{\leq}\mathbb{E}\sum_{i,j=1}^{r}\frac{32}{r^2}\left(\left|\bra{\Psi_{A'A}}K_{x,i}^{\dagger}K_{x,j}\ket{\Psi_{A'A}}\right|^2+\left|\bra{\Psi_{A'A}}K_{y,i}^{\dagger}K_{y,j}\ket{\Psi_{A'A}}\right|^2\right) \\ &=\frac{64}{r^2}\sum_{i,j=1}^{r}\mathbb{E}\left|\bra{\Psi_{A'A}}K_{x,i}^{\dagger}K_{x,j}\ket{\Psi_{A'A}}\right|^2, \end{aligned} \tag{73} \]

where in (x) and in (xi) we have leveraged the inequality $2ab\leq a^2+b^2$ multiple times.

Recalling that we defined $\widetilde{V}_x=U_xS$ and $K_{x,i}=\bra{i}_E\widetilde{V}_x$, we compute

\[ \begin{aligned} &\mathbb{E}\left|\bra{\Psi_{A'A}}K_{x,i}^{\dagger}K_{x,j}\ket{\Psi_{A'A}}\right|^2 \\ &\quad=\frac{1}{d_A^2}\mathbb{E}\left[\operatorname{Tr}[K_{x,i}^{\dagger}K_{x,j}]\operatorname{Tr}[K_{x,j}^{\dagger}K_{x,i}]\right] \\ &\quad=\frac{1}{d_A^2}\sum_{k,l=1}^{d_A}\mathbb{E}\operatorname{Tr}\left[K_{x,i}^{\dagger}K_{x,j}\ket{l}\bra{k}K_{x,j}^{\dagger}K_{x,i}\ket{k}\bra{l}\right] \\ &\quad\stackrel{\text{(xii)}}{=}\frac{1}{d_A^2}\sum_{k,l=1}^{d_A}\mathbb{E}\operatorname{Tr}\left[U_x^{\dagger}\big(\ket{i}_E\bra{j}_E\otimes\mathds{1}_B\big)U_xS\ket{l}\bra{k}S^{\dagger}U_x^{\dagger}\big(\ket{j}_E\bra{i}_E\otimes\mathds{1}_B\big)U_xS\ket{k}\bra{l}S^{\dagger}\right] \\ &\quad\stackrel{\text{(xiii)}}{=}\frac{1}{d_A^2}\sum_{k,l=1}^{d_A}\sum_{\alpha,\beta\in S_2}\operatorname{Wg}(\beta\alpha,d_Br)\,\mathrm{Tr}_{\beta}\left[S\ket{k}\bra{l}S^{\dagger},S\ket{l}\bra{k}S^{\dagger}\right] \\ &\qquad\qquad\times\mathrm{Tr}_{\alpha(12)}\left[\ket{i}_E\bra{j}_E\otimes\mathds{1}_B,\ket{j}_E\bra{i}_E\otimes\mathds{1}_B\right] \\ &\quad=\frac{1}{d_A^2}\sum_{k,l=1}^{d_A}\Big(\operatorname{Wg}((1)(2),d_Br)\big(\delta_{k,l}d_B+\delta_{i,j}d_B^2\big)+\operatorname{Wg}((12),d_Br)\big(\delta_{k,l}\delta_{i,j}d_B^2+d_B\big)\Big) \\ &\quad\stackrel{\text{(xiv)}}{=}\frac{1}{d_A^2}\cdot\frac{1}{(d_Br)^2-1}\sum_{k,l=1}^{d_A}\Big(\delta_{k,l}d_B+\delta_{i,j}d_B^2-\frac{1}{d_Br}\big(\delta_{k,l}\delta_{i,j}d_B^2+d_B\big)\Big) \\ &\quad=\frac{1}{d_A}\cdot\frac{1}{(rd_B)^2-1}\Big(d_B+\delta_{i,j}d_Ad_B^2-\frac{1}{d_Br}\big(\delta_{i,j}d_B^2+d_Ad_B\big)\Big), \end{aligned} \tag{74} \]

where in (xii) we have expanded $K_{x,i}=\bra{i}_EU_xS$ and we have leveraged the cyclicity of the trace; in (xiii) we have used Lemma 6 with $A_1=\ket{i}_E\bra{j}_E\otimes\mathds{1}_B$, $A_2=\ket{j}_E\bra{i}_E\otimes\mathds{1}_B$, $B_1=S\ket{k}\bra{l}S^{\dagger}$ and $B_2=S\ket{l}\bra{k}S^{\dagger}$; in (xiv) we have used the values given in Lemma 7. Combining (73) with (74), we get

\[ \begin{aligned} \mathbb{E}\operatorname{Tr}\big[|C|^4\big] &\leq\frac{64}{r^2}\sum_{i,j=1}^{r}\mathbb{E}\left|\bra{\Psi_{A'A}}K_{x,i}^{\dagger}K_{x,j}\ket{\Psi_{A'A}}\right|^2 \\ &\leq\frac{64}{r^2}\cdot\frac{1}{d_A}\cdot\frac{1}{(rd_B)^2-1}\sum_{i,j=1}^{r}\Big(d_B+\delta_{i,j}d_Ad_B^2-\frac{1}{d_Br}\big(\delta_{i,j}d_B^2+d_Ad_B\big)\Big) \\ &=\frac{64}{r^2}\cdot\frac{1}{d_Ad_B}\cdot\frac{1}{(rd_B)^2-1}\left(d_B^2r^2+rd_Ad_B^3-d_B^2-d_Ad_Br\right) \\ &=\frac{64}{r^2}\cdot\left(\frac{1}{d_Ad_B}+\frac{1}{r}+\frac{1-d_B^2+\frac{d_Ad_B}{r}-d_Ad_Br}{d_Ad_B(r^2d_B^2-1)}\right)\leq\frac{128}{r^3}, \end{aligned} \tag{75} \]

where in the last line we have recalled that $r\leq d_Ad_B$, so that $\frac{1}{d_Ad_B}\leq\frac{1}{r}$, and we have noticed that the last fraction is non-positive. This concludes the proof.
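The algebra in the last display can be double-checked with exact rational arithmetic. The sketch below (hypothetical function names) sums the final expression of (74) over $i,j\in[r]$, compares the result with the closed form in the third line of (75), and verifies the bound $128/r^3$ on a small grid of dimensions. The restrictions $d_A\leq rd_B$ (so that $S$ is an isometry), $r\leq d_Ad_B$ and $rd_B\geq 2$ are assumptions made to keep all expressions well defined.

```python
from fractions import Fraction


def second_moment(d_a, d_b, r, same):
    """Last line of (74); `same` says whether i == j."""
    d = d_b * r
    delta = 1 if same else 0
    inner = d_b + delta * d_a * d_b ** 2 - Fraction(delta * d_b ** 2 + d_a * d_b, d)
    return Fraction(1, d_a) * Fraction(1, d ** 2 - 1) * inner


def summed_bound(d_a, d_b, r):
    """(64 / r^2) times the sum over i, j of the expression above."""
    total = (r * second_moment(d_a, d_b, r, True)
             + r * (r - 1) * second_moment(d_a, d_b, r, False))
    return Fraction(64, r ** 2) * total


def closed_form(d_a, d_b, r):
    """Third line of (75)."""
    bracket = d_b ** 2 * r ** 2 + r * d_a * d_b ** 3 - d_b ** 2 - d_a * d_b * r
    return Fraction(64, r ** 2) * Fraction(bracket, d_a * d_b * ((r * d_b) ** 2 - 1))


def admissible(d_a, d_b, r):
    """Parameter ranges assumed in this check."""
    return d_b * r >= 2 and d_a <= r * d_b and r <= d_a * d_b


if __name__ == "__main__":
    grid = [(a, b, r) for a in range(1, 7) for b in range(1, 7) for r in range(1, 7)
            if admissible(a, b, r)]
    print(all(summed_bound(*p) == closed_form(*p) for p in grid))
    print(all(summed_bound(a, b, r) <= Fraction(128, r ** 3) for a, b, r in grid))
```

A finite grid is of course no substitute for the argument in the text; it only guards against slips in the bookkeeping of the Kronecker deltas.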

Appendix D Weingarten Calculus

As we use a random channel constructed from sampling a Haar-random unitary matrix in our lower bound proofs, we need some facts from Weingarten calculus in order to compute the corresponding expectation values with respect to the Haar measure. If $\pi\in S_n$ is a permutation of $[n]$, let $\operatorname{Wg}(\pi,d)$ denote the Weingarten function of dimension $d$. The following lemma is useful for our results.

Lemma 6 ([gu2013moments]).

Let $U$ be a Haar-distributed unitary $(d\times d)$-matrix and let $\{A_i,B_i\}_{i=1}^{n}$ be a sequence of complex $(d\times d)$-matrices. We have the following formula for the expectation value:

\[ \mathbb{E}\left[\operatorname{Tr}(UB_1U^{\dagger}A_1U\dots UB_nU^{\dagger}A_n)\right]=\sum_{\alpha,\beta\in S_n}\operatorname{Wg}(\beta\alpha^{-1},d)\operatorname{Tr}_{\beta^{-1}}(B_1,\dots,B_n)\operatorname{Tr}_{\alpha\gamma_n}(A_1,\dots,A_n), \tag{76} \]

where $\gamma_n=(12\dots n)$ and, writing $\sigma$ in terms of cycles $\{C_j\}$ as $\sigma=\prod_jC_j$,

\[ \operatorname{Tr}_{\sigma}(M_1,\dots,M_n)\coloneqq\prod_j\operatorname{Tr}\prod_{i\in C_j}M_i. \tag{77} \]

We will also need some values of the Weingarten function.

Lemma 7 ([collins2006integration]).

The function $\operatorname{Wg}(\pi,d)$ has the following values:

  • $\operatorname{Wg}((1),d)=\frac{1}{d}$,

  • $\operatorname{Wg}((12),d)=\frac{-1}{d(d^2-1)}$,

  • $\operatorname{Wg}((1)(2),d)=\frac{1}{d^2-1}$.
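A quick way to sanity-check these values is the standard identity $\mathbb{E}|U_{11}|^4=\sum_{\alpha,\beta\in S_2}\operatorname{Wg}(\beta\alpha^{-1},d)=\frac{2}{d(d+1)}$ for a Haar-random $U\in{\rm U}(d)$ with $d\geq 2$, together with $\mathbb{E}|U_{11}|^2=\operatorname{Wg}((1),d)=\frac{1}{d}$. The sketch below verifies the first identity with exact fractions; the function names are hypothetical.

```python
from fractions import Fraction


def weingarten_s2(d: int) -> dict:
    """Weingarten values on S_2 from Lemma 7, for d >= 2."""
    d = Fraction(d)
    return {
        "(1)(2)": 1 / (d ** 2 - 1),
        "(12)": -1 / (d * (d ** 2 - 1)),
    }


def fourth_moment_of_entry(d: int) -> Fraction:
    """E|U_11|^4 as a sum of Wg(beta * alpha^{-1}, d) over alpha, beta in S_2."""
    wg = weingarten_s2(d)
    # Two pairs (alpha, beta) give the identity and two give the transposition.
    return 2 * wg["(1)(2)"] + 2 * wg["(12)"]


if __name__ == "__main__":
    for d in (2, 3, 10):
        print(d, fourth_moment_of_entry(d), Fraction(2, d * (d + 1)))
```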