Improved Lower Bounds for Learning Quantum Channels in Diamond Distance
Abstract
We prove that learning an unknown quantum channel with input dimension , output dimension , and Choi rank to diamond distance requires queries. This improves the best previous bound by introducing explicit -dependence, with a scaling in that is near-optimal when but not tight in general. The proof constructs an ensemble of channels that are well-separated in diamond norm yet admit Stinespring isometries that are close in operator norm.
1 Introduction
In [AMele2025, chen2025quantumchanneltomographyestimation] it is proved that there exists a quantum learning algorithm that uses
| (1) |
parallel queries of any unknown channel with input dimension , output dimension , and Choi rank and, with probability at least , outputs a classical description of a channel which is at distance at most from in diamond distance. Moreover, in [Girardi2025Dec] it is proved that any quantum algorithm learning up to constant error with success probability at least needs at least
| (2) |
queries of . The aim of our work is to improve this lower bound by making it depend on the target diamond distance . More precisely, our main result (see Theorem 3) establishes the new lower bound
| (3) |
This result, combined with the upper bound of [chen2025quantumchanneltomographyestimation], shows that the optimal dependency in the precision parameter is when . In particular, for learning unitary channels, our result recovers the optimal lower bound of [haah2023query] up to a logarithmic factor, via a different proof strategy that applies specifically in the coherent setting. Moreover, our approach generalizes to non-unitary channels.
When , channel learning becomes state learning, for which the optimal complexity is [ODonnell2016-1, haah2017sample]. This shows that the -dependency in our lower bound is not optimal in general.
The proof of our lower bound consists of two steps. The first is a general lower bound (Theorem 1), which leverages an arbitrary ensemble of channels that are pairwise far in diamond distance, yet whose Stinespring isometries are pairwise close in operator norm. The resulting lower bound scales logarithmically with the size of the ensemble and inversely with the distance between the Stinespring isometries. The second step is the actual construction of a suitable ensemble of channels. A natural strategy to this end is to use existing packing nets. However, as we show in Appendix A, this approach, although simpler, yields a weaker bound:
| (4) |
Instead, using a probabilistic approach, we construct a random family of Stinespring isometries which, with positive probability, are sufficiently close in operator norm but give rise to channels that are far enough apart in diamond distance. Such an argument ensures the existence of an ensemble that produces the desired lower bound (Theorem 3).
The remainder of the manuscript is organised as follows. In the rest of Section 1 we introduce the notation and the definitions used throughout the paper. In Section 2 we state and prove Theorem 1, i.e. the general lower bound on channel learning expressed in terms of ensembles of quantum channels. In Section 3 we prove the lower bound (3) by leveraging a particular family of isometries (Lemma 2) to construct a suitable ensemble of channels to be used in Theorem 1 (see Theorem 3). The isometries of Lemma 2 are obtained from a random construction, which is discussed in Section 4. In the Appendix we provide the proofs that were deferred in the previous sections to improve readability.
1.1 Notation
All the Hilbert spaces that we are going to consider are supposed to be finite-dimensional. Let and denote input and output spaces. We write for linear operators on , and for quantum states (positive semi-definite operators with unit trace). The operator norm is , and the trace norm is . For , the von Neumann entropy is . All logarithms are in base .
1.2 Quantum channels and their representations
A quantum channel is a completely positive trace-preserving (CPTP) map. It can be written in the Kraus representation as with . The minimal is called the Kraus rank.
For any channel , there exists an isometry () such that . Such an isometry is called a Stinespring dilation of . The minimal dimension of equals the Kraus rank.
The Choi state of the channel is
| (5) |
where is the normalised maximally entangled state between and . The linear map is CPTP if and only if and .
The Choi rank equals the Kraus rank and the minimal environment dimension of Stinespring dilations.
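The relations between these representations can be checked numerically. The following is a minimal sketch, assuming NumPy and using the qubit amplitude-damping channel as an illustrative example (the channel, the parameter `gamma`, and the helper name `choi_state` are our own choices and do not appear elsewhere in the paper). It builds the normalised Choi state from Kraus operators and checks the completeness relation, positivity, the marginal condition on the reference system, and that the Choi rank matches the number of linearly independent Kraus operators.

```python
import numpy as np

def choi_state(kraus, d_in):
    """Normalised Choi state (id ⊗ N)(|Phi><Phi|), ordered as reference ⊗ output."""
    d_out = kraus[0].shape[0]
    # Maximally entangled vector (1/sqrt(d)) sum_i |i>|i> on reference ⊗ input.
    phi = np.eye(d_in).reshape(d_in * d_in) / np.sqrt(d_in)
    J = np.zeros((d_in * d_out, d_in * d_out), dtype=complex)
    for K in kraus:
        v = np.kron(np.eye(d_in), K) @ phi  # (1 ⊗ K)|Phi>
        J += np.outer(v, v.conj())
    return J

# Illustrative channel (an assumption of this sketch): amplitude damping.
gamma = 0.3
K0 = np.array([[1, 0], [0, np.sqrt(1 - gamma)]], dtype=complex)
K1 = np.array([[0, np.sqrt(gamma)], [0, 0]], dtype=complex)
kraus = [K0, K1]
d_in = d_out = 2

# Trace preservation: sum_k K_k^dagger K_k = identity.
completeness = sum(K.conj().T @ K for K in kraus)

J = choi_state(kraus, d_in)
min_eig = np.linalg.eigvalsh(J).min()  # positivity of the Choi state
# Partial trace over the output system: should be the maximally mixed state.
marginal = np.einsum('ibjb->ij', J.reshape(d_in, d_out, d_in, d_out))
choi_rank = np.linalg.matrix_rank(J, tol=1e-10)

print(choi_rank, np.round(marginal, 6))
```

Here the two Kraus operators are linearly independent, so the Choi rank computed from the Choi state coincides with the Kraus rank.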
1.3 Channel ensembles with distance constraints
Let denote the set of quantum channels with Choi rank at most , i.e.
| (6) |
For a channel , let be a Stinespring isometry with minimal environment dimension .
For channels , the diamond distance is defined as
| (7) |
where the supremum is over all auxiliary spaces and states .
We say that two channels are -diamond far if
| (8) |
We say that their Stinespring isometries are -operator norm close if there exist choices of isometries such that
| (9) |
Finally, we define as the set of ensembles of channels that are pairwise -diamond-far but have -close Stinespring isometries:
| (10) |
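To make the two distance notions concrete, here is a small numerical sketch, assuming NumPy. It compares two amplitude-damping channels (an illustrative choice of ours, as are the helper names). For a fixed choice of Stinespring isometries it computes the operator-norm gap, and it computes the unnormalised trace norm of the difference of the Choi states, which is the value attained by the maximally entangled input in the supremum defining the diamond distance (up to whatever normalisation convention is used there). The sketch does not optimise over the choice of isometries, so the operator-norm gap it reports is only an upper bound on the best achievable closeness.

```python
import numpy as np

def amplitude_damping(gamma):
    """Kraus operators of the qubit amplitude-damping channel (illustrative)."""
    K0 = np.array([[1, 0], [0, np.sqrt(1 - gamma)]], dtype=complex)
    K1 = np.array([[0, np.sqrt(gamma)], [0, 0]], dtype=complex)
    return [K0, K1]

def stinespring(kraus):
    """Stack Kraus operators into V = sum_k |k>_E ⊗ K_k (environment index first)."""
    return np.vstack(kraus)

def choi_state(V, d_in, r):
    """Normalised Choi state of the channel rho -> tr_E[V rho V^dagger]."""
    d_out = V.shape[0] // r
    phi = np.eye(d_in).reshape(d_in * d_in) / np.sqrt(d_in)
    psi = (np.kron(np.eye(d_in), V) @ phi).reshape(d_in, r, d_out)
    # Trace out the environment index e.
    J = np.einsum('aeb,ced->abcd', psi, psi.conj())
    return J.reshape(d_in * d_out, d_in * d_out)

V = stinespring(amplitude_damping(0.2))
W = stinespring(amplitude_damping(0.3))

op_gap = np.linalg.norm(V - W, 2)  # operator norm of V - W
choi_gap = np.linalg.norm(choi_state(V, 2, 2) - choi_state(W, 2, 2), 'nuc')

print(op_gap, choi_gap)
```

For any two isometries the trace-norm gap at any input is at most twice the operator-norm gap, which is why closeness of the isometries limits how far apart the channels can be.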
1.4 The coherent query model
In the coherent query model for quantum channel learning, the learning algorithm is allowed to interleave queries to the unknown channel with arbitrary, adaptively chosen quantum operations. Formally, the algorithm prepares an initial quantum state on a system comprising the input space of dimension together with an auxiliary system of arbitrary dimension. It then performs uses of the unknown channel , interspersed with arbitrary quantum channels (the intermediate operations) that act jointly on the output space and the auxiliary system. The final state after queries is
| (11) |
where each denotes a query to the unknown channel acting on the input system while leaving the auxiliary system unchanged. Finally, the algorithm measures with a positive operator-valued measure (POVM) to produce a classical description of an estimate . The query complexity is the minimum number of uses of required to output, with high probability, an estimate such that .
This model generalizes the parallel (or non-adaptive) query model, in which all uses of are applied in parallel on a (possibly entangled) input state, corresponding to the special case where the intermediate operations are all identity channels. The coherent model captures the most general physically realizable learning procedure that respects causality and does not assume access to the inverse or conjugate of . It is the natural setting for studying the fundamental quantum limits of channel learning when arbitrary quantum processing between queries is allowed.
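The query model can be illustrated with a toy simulation. The sketch below, assuming NumPy, applies a channel given by Kraus operators to the system register while leaving an ancilla untouched, and interleaves the queries with intermediate operations. As simplifying assumptions of this sketch, the input and output dimensions coincide, the intermediate operations are unitaries on system ⊗ ancilla, and the unknown channel is an amplitude-damping channel; the function names are hypothetical.

```python
import numpy as np

def apply_query(rho, kraus, d_anc):
    """Apply (N ⊗ id_ancilla) to a state rho on system ⊗ ancilla."""
    out = np.zeros_like(rho)
    for K in kraus:
        Kf = np.kron(K, np.eye(d_anc))
        out = out + Kf @ rho @ Kf.conj().T
    return out

def coherent_run(rho0, kraus, intermediates, d_anc):
    """One query, then for each intermediate unitary: apply it, then query again."""
    rho = apply_query(rho0, kraus, d_anc)
    for U in intermediates:
        rho = U @ rho @ U.conj().T
        rho = apply_query(rho, kraus, d_anc)
    return rho

def random_unitary(d, rng):
    """Some unitary obtained from a QR decomposition (not claimed to be Haar)."""
    Z = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    Q, _ = np.linalg.qr(Z)
    return Q

rng = np.random.default_rng(0)
d_sys, d_anc, T = 2, 2, 4
gamma = 0.3
kraus = [np.array([[1, 0], [0, np.sqrt(1 - gamma)]], dtype=complex),
         np.array([[0, np.sqrt(gamma)], [0, 0]], dtype=complex)]

rho0 = np.zeros((d_sys * d_anc, d_sys * d_anc), dtype=complex)
rho0[0, 0] = 1.0  # initial state |0><0| on system ⊗ ancilla
intermediates = [random_unitary(d_sys * d_anc, rng) for _ in range(T - 1)]

rho_T = coherent_run(rho0, kraus, intermediates, d_anc)  # state after T queries
print(np.trace(rho_T).real)
```

The final state would then be measured with a POVM; that step is omitted here since it does not affect the query count.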
2 A general lower bound on channel learning
In this section, we prove a general lower bound for learning a general quantum channel in diamond distance.
Theorem 1 (General lower bound).
Let , and . Consider an ensemble of quantum channels that are -diamond-far and whose Stinespring isometries are -operator-norm-close. Any coherent algorithm that constructs such that with probability at least for all needs at least
| (12) |
uses of .
We follow a standard strategy for proving lower bounds for learning problems (e.g., [flammia2012quantum, haah2017sample, lowe2022lower, fawzi2023lower, oufkir2023sample, Bluhm2024Mar, Rosenthal2024Sep, Mele_2025]).
Proof.
Let us consider any fixed coherent algorithm that constructs such that with probability at least for all . Let and let be the index output by such an algorithm upon receiving the quantum channel . Since the quantum channels are pairwise -diamond-far, the algorithm can identify simply by picking the -closest channel to in (see Figure 1). Hence, by Fano’s inequality [FANO] we have:
| (13) |
A coherent algorithm using the quantum channel chooses the input state , the channels , and measures the output state:
| (14) |
We can suppose that the channels act on different systems of dimension (we can include swap channels in if necessary). We can assume, without loss of generality, that all the channels are isometries up to modifying the measurement at the end. Similarly, we can suppose that is applied instead of x directly after for . The global system is thus where and is an ancilla system of arbitrary dimension. The global state before measurement becomes
| (15) |
with . For , we denote , and , so that we have
| (16) |
Denote by and . Hence the mutual information between and the observation of the coherent algorithm can be bounded as follows
| (17) | ||||
where (i) uses Holevo’s theorem [holevo1973bounds]; (ii) is a telescoping sum and uses the fact that for all , as all the applied operations are isometries; (iii) uses the assumption that and are isometric channels.
Now, observe that we have that so we can apply the continuity bound of [Berta2024Aug, Theorem 5]
| (18) | ||||
with being the binary entropy. We have that
| (19) | ||||
with being a quantum state. Using the triangle inequality, we obtain
| (20) | ||||
Therefore, we deduce
| (21) | ||||
where we used that for . Since we deduce that
| (22) |
This concludes the proof. ∎
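The counting logic behind this argument can be sketched in a few lines of Python. The sketch uses the textbook form of Fano's inequality for a uniformly distributed index over `M` hypotheses and error probability at most `delta` (with `delta` at most 1/2), which may differ in constants from the form used in (13). The quantity `gain_per_query` is a hypothetical stand-in for the per-query bound on the increase of mutual information derived in the proof; its actual expression is given by (21) and is not reproduced here.

```python
import math

def binary_entropy(p):
    """Binary entropy in bits."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def fano_information_bound(M, delta):
    """Mutual information (bits) needed to identify one of M uniformly
    distributed hypotheses with error probability at most delta <= 1/2,
    by the standard Fano inequality."""
    return (1 - delta) * math.log2(M) - binary_entropy(delta)

def queries_needed(M, delta, gain_per_query):
    """If each query adds at most gain_per_query bits of mutual information
    (a hypothetical parameter of this sketch), at least this many queries
    are required."""
    return math.ceil(fano_information_bound(M, delta) / gain_per_query)

required_bits = fano_information_bound(2 ** 10, 1 / 3)
T_min = queries_needed(2 ** 10, 1 / 3, gain_per_query=0.05)
print(required_bits, T_min)
```

The structure mirrors the theorem: the numerator grows with the logarithm of the ensemble size, and the denominator shrinks as the Stinespring isometries get closer.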
Given Theorem 1, we can prove lower bounds on learning quantum channels by constructing an ensemble within containing quantum channels that are pairwise -diamond-far, yet whose Stinespring isometries are pairwise -operator-norm-close. The resulting lower bound then scales with and inversely with . To strengthen this bound, we should aim to construct an ensemble that maximizes while minimizing . (Footnote 1: by the inequality [kretschmann2008information], the parameter should be at least .)
A natural approach to constructing such an ensemble is to use existing packing nets. However, this leads to an ensemble in of cardinality satisfying , which by Theorem 1 implies the following weaker lower bound (see Appendix A for details):
| (23) |
In what follows, we improve the -dependence of this lower bound by constructing a new ensemble in with comparable cardinality but with rather than .
3 An ensemble yielding an improved lower bound
In this section, we improve the bound (23) by constructing an ensemble with cardinality satisfying . More precisely, we construct a set of isometries corresponding to quantum channels such that , , and , as in Figure 2.
We prove the existence of such a set using a probabilistic argument. Let be a quantum channel with Kraus operators satisfying
| (24) |
The existence of such a channel is shown in Appendix B. Let be a Stinespring isometry of the quantum channel .
Lemma 2.
There exists a set of isometries such that and, for all , we have
| (25) |
where is the maximally entangled state with .
Proof.
The proof of this lemma is deferred to Section 4. ∎
We now have all the ingredients to prove the main result.
Proof.
Given the set of isometries provided by Lemma 2, we define the isometries
| (27) |
and let be the corresponding quantum channel. It has Kraus rank at most . The quantum channel x has input system of dimension and output system of dimension . We have that
| (28) |
On the other hand, lower bounding the diamond norm by choosing the input state to be the maximally entangled state , we have
| (29) | ||||
where in (i) we have used the reverse triangle inequality and the bound
| (30) |
which holds for every operator and follows from the data-processing inequality for the trace norm; in (ii) we have noticed that
| (31) | ||||
in (iii) we have upper bounded ; finally, in (iv) we used .
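The data-processing step can be sanity-checked numerically. The sketch below, assuming NumPy, takes the bound in what we believe is its intended form, namely that the partial trace does not increase the trace norm of an arbitrary (not necessarily Hermitian) operator; the exact statement of (30) may differ. The dimensions and the random operator are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
d_b, d_e = 3, 4  # illustrative dimensions of the kept and traced-out systems

# A generic operator on B ⊗ E; it need not be Hermitian or positive.
X = rng.normal(size=(d_b * d_e, d_b * d_e)) \
    + 1j * rng.normal(size=(d_b * d_e, d_b * d_e))

# Partial trace over E, with index ordering (b, e, b', e').
X_B = np.einsum('iaja->ij', X.reshape(d_b, d_e, d_b, d_e))

norm_full = np.linalg.norm(X, 'nuc')       # trace norm of X
norm_reduced = np.linalg.norm(X_B, 'nuc')  # trace norm of tr_E X

print(norm_reduced, norm_full)
```

The partial trace is a quantum channel, and the trace norm is contractive under channels, which is the fact used here.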
To sum up, we showed the existence of with cardinality satisfying . This implies the existence of with cardinality satisfying for . By Theorem 1, we conclude:
| (32) |
which completes the proof. ∎
4 Proof of Lemma 2
In order to provide the set of isometries claimed in the statement of Lemma 2, we are going to use the following construction. Let us introduce
| (33) |
where is a unitary operator. In our construction, we will consider independent random unitaries sampled according to the Haar measure. We are going to leverage the following technical statements in the proof of Lemma 2.
Lemma 4.
Let and let be defined as in (33). Then, let us define the operator
| (34) |
where is the maximally entangled state between the systems and . The function is -Lipschitz with respect to the -sum of the 2-norms, namely
| (35) |
for all , where is the -sum of the 2-norms. Furthermore, if we consider independent random unitaries , we have
(a) ,
(b) .
Proof.
See Appendix C. ∎
Lemma 5 ([meckes2013spectral, Corollary 17]).
Let . Suppose that is -Lipschitz with respect to the -sum of the 2-norms, i.e.
| (36) |
for all , with . Then, if we independently sample according to the Haar measure on , the following inequality holds for each :
| (37) |
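The concentration phenomenon behind Lemma 5 can be observed empirically. The sketch below, assuming NumPy, samples Haar-random unitaries via the QR decomposition of a complex Gaussian matrix with the standard phase correction, and evaluates the function U ↦ Re U₁₁, which is 1-Lipschitz with respect to the Hilbert-Schmidt norm. This test function, the dimension and the sample size are our own illustrative choices; the sketch does not reproduce the constants of (37).

```python
import numpy as np

def haar_unitary(d, rng):
    """Haar-random d x d unitary: QR of a complex Ginibre matrix,
    with the phases of diag(R) absorbed into the columns of Q."""
    Z = (rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))) / np.sqrt(2)
    Q, R = np.linalg.qr(Z)
    diag = np.diag(R)
    return Q * (diag / np.abs(diag))

rng = np.random.default_rng(1)
d, n = 8, 2000

U = haar_unitary(d, rng)
# f(U) = Re U_11 is 1-Lipschitz in the Hilbert-Schmidt norm.
samples = np.array([haar_unitary(d, rng)[0, 0].real for _ in range(n)])

# By unitary invariance E[f] = 0 and Var[f] = 1 / (2 d).
print(samples.mean(), samples.var(), 1 / (2 * d))
```

The fluctuations shrink as the dimension grows, which is the qualitative behaviour that the lemma quantifies for general Lipschitz functions of several independent unitaries.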
Now we have all the ingredients to prove Lemma 2.
Proof of Lemma 2..
Let be as in Lemma 4. By Hölder’s inequality applied to with conjugate exponents and , we get
| (38) |
which yields
| (39) |
Whence, by the bounds (a) and (b) of Lemma 4 combined with (39), we have
| (40) |
Furthermore, since the function is -Lipschitz, when we sample two independent unitaries , by Lemma 5, we have (Footnote 2: to be precise, we are applying Lemma 5 to , which is also -Lipschitz.)
| (41) |
Therefore, with probability at least , we have
| (42) |
where in (i) we have used the lower bound (40). Let
| (43) |
and let be i.i.d. Haar random matrices. Note that, by their very definitions, and . By the union bound, we have
5 Conclusion
We have proved that learning an unknown quantum channel to diamond distance requires queries, improving upon the previous bound. The proof constructs ensembles of channels that are well-separated in diamond norm yet admit Stinespring isometries that are close in operator norm.
Several natural questions remain open. First, the precise -dependence in the general case is still unclear: while our bound scales as , the state-learning regime suggests is necessary in some parameter ranges. Second, it is unknown whether coherent strategies offer any advantage over parallel strategies for channel learning in diamond distance. Finally, the role of quantum memory in the query complexity requires further exploration.
Acknowledgments
FG acknowledges financial support from the European Union (ERC StG ETQO, Grant Agreement no. 101165230).
References
Appendix A A weaker lower bound using existing packing nets
A natural approach to constructing such an ensemble in is to use packing nets. Assume that . From [Girardi2025Dec, Lemma 14], we have
| (46) |
where denotes the -packing number of the set with respect to the norm .
Let , and let be a -diamond-norm packing of quantum channels, with corresponding Stinespring isometries .
For a given and each , we define the convex mixture
| (47) |
This is a valid quantum channel of Choi rank at most . We observe that for any distinct ,
| (48) |
since by the packing property. Moreover, by [kretschmann2008information], we have
| (49) |
where is a Stinespring isometry for 1. Let be a Stinespring isometry for x achieving this infimum. Then for all ,
| (50) |
Thus, , which, by Theorem 1, implies the lower bound
| (51) |
Appendix B Existence of the quantum channel
In this section, we want to show the existence of a quantum channel with Kraus operators satisfying
| (52) |
We distinguish two cases depending on whether or not.
• Case : , let and . Note that . We decompose , where .
For each block (), we can choose orthogonal unitary matrices (for example, a subset of the generalized Pauli operators). Since , we may select a subset with . For each , define the Kraus operator
| (53) |
where the direct sum is taken with respect to the decomposition above, and acts nontrivially only on the -th summand.
We then verify:
(a) Completeness:
| (54) |
(b) Orthogonality: For all ,
| (55) |
(c) Kraus rank: The number of Kraus operators is exactly .
• Case : , let and write with . We can then decompose , where each (i.e., ).
For each block (), construct orthogonal unitary matrices that are supported on and define the corresponding matrices
| (56) |
where the direct sum is taken with respect to the decomposition and acts nontrivially only on the -th -dimensional summand.
For the remaining block , since we can apply Case and construct orthogonal isometries and define
| (57) |
where now acts nontrivially only on the summand. We can check:
(a) Completeness:
| (58) |
(b) Orthogonality: For all and ,
| (59) |
and for ,
| (60) |
(c) Kraus rank: The total number of Kraus operators is
| (61) |
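The basic building block of this construction, a family of mutually orthogonal unitaries, can be sketched for a single block. The code below, assuming NumPy, generates the generalized Pauli (Heisenberg-Weyl) operators in dimension `d`, verifies that they are pairwise orthogonal in the Hilbert-Schmidt inner product, and rescales a subset of `r` of them into Kraus operators. It treats only the simplified situation of one block with equal input and output dimension; the block decomposition and the exact normalisation in (52) are not reproduced, and the names below are hypothetical.

```python
import numpy as np

def weyl_operators(d):
    """The d^2 generalized Pauli operators X^a Z^b in dimension d."""
    omega = np.exp(2j * np.pi / d)
    X = np.roll(np.eye(d), 1, axis=0)   # shift: X|j> = |j+1 mod d>
    Z = np.diag(omega ** np.arange(d))  # clock: Z|j> = omega^j |j>
    return [np.linalg.matrix_power(X, a) @ np.linalg.matrix_power(Z, b)
            for a in range(d) for b in range(d)]

d, r = 3, 5  # illustrative block dimension and number of Kraus operators
weyl = weyl_operators(d)

# Gram matrix of Hilbert-Schmidt inner products tr(W_i^dagger W_j).
gram = np.array([[np.trace(A.conj().T @ B) for B in weyl] for A in weyl])

# Rescale r of the unitaries so that the completeness relation holds.
kraus = [W / np.sqrt(r) for W in weyl[:r]]
completeness = sum(K.conj().T @ K for K in kraus)

print(np.round(np.abs(gram[:3, :3]), 6))
```

Since the `r` chosen operators are orthogonal and nonzero, they are linearly independent, so the resulting channel has Kraus rank exactly `r` for any `r` up to `d**2`.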
Appendix C Proof of Lemma 4
Let us start by upper bounding
| (62) | ||||
where in (i) we have used the data-processing inequality in a similar way to (30), in (ii) we have leveraged the variational characterisation of the 1-norm and we have absorbed in , in (iii) we have recalled the identity
| (63) |
finally, in (iv) we have noticed that
| (64) | ||||
as .
Calling , we have
| (65) | ||||
where in (v) we have used the reverse triangle inequality, in (vi) we have leveraged the triangle inequality, in (vii) we have bounded as in (62), and in (viii) we have recalled the inequality . This completes the first part of the proof of Lemma 4. Now, we want to prove that
| (66) |
when we sample independent random unitaries . Let be an orthonormal basis for . For , let and be the Kraus operators obtained from the isometries and , respectively. Writing the trace on the system in terms of the basis , we get
| (67) | ||||
where in (viii) we have expanded
| (68) | ||||
and, for , we have computed
| (69) | ||||
leveraging the fact that , and .
The only inequality we are left to prove is
| (70) |
We have
| (71) | ||||
where in (ix) we have noticed that, by (24),
| (72) | ||||
Hence
| (73) | ||||
where in (x) and in (xi) we have leveraged the inequality multiple times.
Recalling that we defined and , we compute
| (74) | ||||
where in (xii) we have expanded and we have leveraged the cyclicity of the trace; in (xiii) we have used Lemma 6 with , , and ; in (xiv) we have used the values given in Lemma 7. Combining (73) with (74), we get
| (75) | ||||
where in the last line we have recalled that . This concludes the proof.
Appendix D Weingarten Calculus
As we use a random channel constructed by sampling a Haar-random unitary matrix in our lower bound proofs, we need some facts from Weingarten calculus in order to compute the corresponding expectation values with respect to the Haar measure. If is a permutation of , let denote the Weingarten function of dimension . The following lemma is useful for our results.
Lemma 6 ([gu2013moments]).
Let be a Haar-distributed unitary -matrix and let be a sequence of complex -matrices. We have the following formula for the expectation value:
| (76) | ||||
where and, writing in terms of cycles as ,
| (77) |
We will also need some values of Weingarten function.
Lemma 7 ([collins2006integration]).
The function has the following values:
• ,
• ,
• .
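Low-order Weingarten values can be checked by Monte Carlo. The sketch below, assuming NumPy, uses the standard values Wg = 1/d for the identity permutation on one element, and Wg = 1/(d² − 1) and Wg = −1/(d(d² − 1)) for the identity and the transposition on two elements. These are well-known values, which we do not claim to be exactly the list stated in Lemma 7. They imply E|U₁₁|² = 1/d, E|U₁₁|²|U₂₂|² = 1/(d² − 1) and E|U₁₁|²|U₁₂|² = 1/(d(d + 1)), which the code estimates from Haar samples.

```python
import numpy as np

def haar_unitary(d, rng):
    """Haar-random unitary via QR of a complex Gaussian matrix with phase fix."""
    Z = (rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))) / np.sqrt(2)
    Q, R = np.linalg.qr(Z)
    diag = np.diag(R)
    return Q * (diag / np.abs(diag))

rng = np.random.default_rng(2)
d, n = 3, 5000

m1 = m_diag = m_row = 0.0
for _ in range(n):
    P = np.abs(haar_unitary(d, rng)) ** 2  # entrywise squared moduli
    m1 += P[0, 0]                          # |U_11|^2
    m_diag += P[0, 0] * P[1, 1]            # |U_11|^2 |U_22|^2
    m_row += P[0, 0] * P[0, 1]             # |U_11|^2 |U_12|^2
m1, m_diag, m_row = m1 / n, m_diag / n, m_row / n

wg_id_1 = 1 / d                      # Wg of the identity, one element
wg_id_2 = 1 / (d ** 2 - 1)           # Wg of the identity, two elements
wg_swap = -1 / (d * (d ** 2 - 1))    # Wg of the transposition

print(m1, wg_id_1)
print(m_diag, wg_id_2)
print(m_row, wg_id_2 + wg_swap)
```

The third moment involves both permutations because the two entries share a row index, so the identity and the transposition both contribute to the sum in the integration formula.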