
Chapter 16: Quantum Information Theory

In this lecture you will learn:

• Classical Information Theory and Entropy
• Classical Information Compression and Information Communication
• Von Neumann Entropy and Quantum Information Theory
• Holevo’s Theorem and Accessible Information
• HSW Theorem and Quantum Communication
• Classical Communication with Quantum States of Light
• Entanglement and Entropy



Classical Coding Theory
Consider the following possible values of a random variable X that is to be measured,
and the corresponding a-priori probabilities:

Value of X   Probability   Coding #1
a            1/32          000
b            1/32          001
c            1/8           010
d            1/4           011
e            1/8           100
f            1/8           101
g            1/16          110
h            1/4           111

Suppose you make the measurement N times. After you are done, you wish to tell
your friend about ALL the measurement results. How many bits do you need to do this?

Coding Scheme #1:
There are 8 possible outcomes of every measurement, so we need 3 bits to encode
all the outcomes of a single measurement, and all N outcomes can be encoded
using 3N bits.


Classical Coding Theory and Information Compression
Coding scheme #1 does not take into account that some measurement outcomes
are very unlikely and some are much more likely.

Value of X   Probability   Coding #1   Coding #2
a            1/32          000         00000
b            1/32          001         00001
c            1/8           010         011
d            1/4           011         10
e            1/8           100         001
f            1/8           101         010
g            1/16          110         0001
h            1/4           111         11

Coding Scheme #2: Assign shorter codes to more likely outcomes and longer codes
to less likely outcomes, but such that any chain of bits representing the N outcomes
is uniquely decodable!

Average number of bits required per outcome
= 2×(1/32)×5 + 1×(1/16)×4 + 3×(1/8)×3 + 2×(1/4)×2 = 2.69 bits!

Bits required to transmit the results of N measurements: 2.69N < 3N bits!


Entropy and Information
Consider a random variable X with a probability distribution p(x).

Question: If you were to find out the outcome of the random variable (say after
making a measurement), then how much information did you acquire?

Hint: How many bits do you need, on average, to convey the result of the
measurement to your friend?

[Figure: two distributions p(x) vs X — for a sharply peaked (deterministic)
distribution the answer should be zero; for a broad (random) distribution the
answer should NOT be zero]

Information and Asymptotic Equipartition Property
Consider a random variable X with the probability distribution P(x):
$$X \in \{a, b, c, d, e, f, g, h\}$$
Suppose we measure X exactly N times (N is very large) and we plan to send the results
to a friend. The results of these measurements form the sequence $x_1, x_2, x_3, \ldots, x_N$.

The joint a-priori probability for this sequence is:
$$P(x_1, x_2, x_3, \ldots, x_N) = \prod_{i=1}^{N} P(x_i)
\quad\Rightarrow\quad -\frac{1}{N}\log_2 P(x_1, x_2, x_3, \ldots, x_N) = -\frac{1}{N}\sum_{i=1}^{N}\log_2 P(x_i)$$
As N→∞:
$$\lim_{N\to\infty}\left[-\frac{1}{N}\sum_{i=1}^{N}\log_2 P(x_i)\right] = -\sum_{x} P(x)\log_2 P(x) = H(X)
\qquad \text{Asymptotic Equipartition Property (AEP)}$$
This means that as N→∞,
$$P(x_1, x_2, x_3, \ldots, x_N) \approx 2^{-NH(X)}$$
● This means that as N→∞ there can only be $2^{NH(X)}$ different result sequences that are
probabilistically likely (and each one of them has the same a-priori probability)
● Therefore, we only need NH(X) bits to encode any result sequence that is likely to occur
● This means on average we need only H(X) bits per result to encode it
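
As a quick numerical sanity check on the AEP (a minimal sketch; the eight-letter distribution is the one from the coding example above, and the sequence length is illustrative):

```python
# Minimal numerical check of the AEP using the eight-letter
# distribution from the coding example (a sketch).
import numpy as np

p = np.array([1/32, 1/32, 1/8, 1/4, 1/8, 1/8, 1/16, 1/4])
H = -np.sum(p * np.log2(p))                 # entropy H(X) = 2.69 bits

rng = np.random.default_rng(0)
N = 100_000
seq = rng.choice(len(p), size=N, p=p)       # N i.i.d. draws of X
empirical = -np.mean(np.log2(p[seq]))       # -(1/N) log2 P(x1,...,xN)

print(H, empirical)   # both close to 2.69 as N grows
```
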
Entropy and Information
Entropy:
The amount of information (in bits) that is gained by learning the outcome of a
measurement of a random variable is given by the entropy function:
$$H(X) = -\sum_{x} p(x)\log_2 p(x)$$
Equivalently, entropy is the minimum number of bits required on average to transmit
reliably the outcome of a measurement of the random variable.

Case 1: Completely deterministic scenario!
$$H(X) = 0$$
Case 2: Completely random scenario!
$$p(x) = \frac{1}{8} \quad x \in \{a, b, \ldots, h\}
\qquad\Rightarrow\qquad H(X) = \log_2 8 = 3\ \text{bits}$$
Entropy and Information

Value of X   Probability   Coding #1   Coding #2
a            1/32          000         00000
b            1/32          001         00001
c            1/8           010         011
d            1/4           011         10
e            1/8           100         001
f            1/8           101         010
g            1/16          110         0001
h            1/4           111         11

$$H(X) = -\sum_x p(x)\log_2 p(x) = 2.69\ \text{bits!}$$
The entropy equals the average number of bits per outcome of Coding #2: that
coding scheme achieves the entropy limit.
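
A short sketch verifying that the entropy of the tabulated distribution equals the average codeword length of Coding #2 (the probabilities and codes are those in the table above):

```python
# Entropy of the example distribution vs. average length of Coding #2.
import math

prob  = {'a': 1/32, 'b': 1/32, 'c': 1/8,  'd': 1/4,
         'e': 1/8,  'f': 1/8,  'g': 1/16, 'h': 1/4}
code2 = {'a': '00000', 'b': '00001', 'c': '011',  'd': '10',
         'e': '001',   'f': '010',   'g': '0001', 'h': '11'}

H = -sum(p * math.log2(p) for p in prob.values())
avg_len = sum(prob[x] * len(code2[x]) for x in prob)

print(H, avg_len)   # both equal 2.6875 ≈ 2.69 bits
```
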


Entropy and Data Compression
A classical message M consists of a very long sequence of letters $y_i$:
$$M = \{y_1, y_2, y_3, \ldots, y_N\}$$
in which each letter belongs to an alphabet A of k letters:
$$A = \{a_1, a_2, a_3, \ldots, a_k\}$$
In the message, each letter $a_i$ occurs with an a-priori probability $p_i$.

The entropy of the message is then:
$$H(C) = -\sum_{i=1}^{k} p_i \log_2 p_i$$

Shannon's Source Coding Theorem:
A classical message of N letters, as described above, can be reliably compressed to
just NH(C) bits and recovered with an error probability that approaches zero as the
message length N becomes large.

C. Shannon, "A Mathematical Theory of Communication", Bell System Technical
Journal, vol. 27, pp. 379–423 and 623–656, July and October 1948.
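
Shannon's theorem is non-constructive, but for a known letter distribution a Huffman code gets within one bit of H(C) per letter, and achieves it exactly for dyadic probabilities like the example above. A minimal sketch (Huffman coding is a standard construction, not something specific to these slides):

```python
# Huffman code construction for a known letter distribution (a sketch).
import heapq

def huffman(probs):
    # Each heap entry: (probability, unique tie-breaker, {letter: code-so-far})
    heap = [(p, i, {x: ''}) for i, (x, p) in enumerate(probs.items())]
    heapq.heapify(heap)
    count = len(heap)
    while len(heap) > 1:
        p1, _, c1 = heapq.heappop(heap)   # two least likely subtrees
        p2, _, c2 = heapq.heappop(heap)
        merged = {x: '0' + c for x, c in c1.items()}
        merged.update({x: '1' + c for x, c in c2.items()})
        heapq.heappush(heap, (p1 + p2, count, merged))
        count += 1
    return heap[0][2]

probs = {'a': 1/32, 'b': 1/32, 'c': 1/8, 'd': 1/4,
         'e': 1/8, 'f': 1/8, 'g': 1/16, 'h': 1/4}
codes = huffman(probs)
print(codes)
print(sum(probs[x] * len(codes[x]) for x in probs))  # 2.6875 bits/letter
```
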
Entropy Maximizing Distributions
Continuous random variable:
What probability distribution maximizes H(X) subject to the constraints:
$$\langle x\rangle = x_o \qquad \langle (x-x_o)^2\rangle = \sigma^2 \qquad -\infty < x < \infty$$
Answer: A Gaussian (or Normal) distribution
$$P(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\,e^{-\frac{(x-x_o)^2}{2\sigma^2}} = \mathcal{N}(x_o, \sigma^2)
\qquad H(X) = \frac{1}{2}\log_2\left(2\pi e\sigma^2\right)$$

Discrete random variable:
What probability distribution maximizes H(N) subject to the constraints:
$$\langle n\rangle = n_o \qquad n = 0, 1, 2, 3, \ldots$$
Answer: A Thermal (or Bose-Einstein) distribution
$$P(n) = \frac{1}{1+n_o}\left(\frac{n_o}{1+n_o}\right)^n
\qquad H(N) = \log_2\left(1+n_o\right) + n_o\log_2\left(1+\frac{1}{n_o}\right)$$
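
A quick numerical check of the discrete case (a sketch; the mean photon number and truncation point are illustrative): the entropy of the thermal distribution, summed directly, matches the closed form above.

```python
# Entropy of the thermal (Bose-Einstein) distribution with mean n_o
# vs. the closed form log2(1+n_o) + n_o log2(1+1/n_o).  (A sketch.)
import numpy as np

n_o = 2.5
n = np.arange(0, 400)                       # truncate the infinite sum
P = (1/(1+n_o)) * (n_o/(1+n_o))**n          # thermal distribution
H_numeric = -np.sum(P * np.log2(P))
H_formula = np.log2(1+n_o) + n_o*np.log2(1+1/n_o)

print(H_numeric, H_formula)   # agree to numerical precision
```
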


Conditional Entropy

How much information can be obtained on average from learning about the outcome of
a measurement of a random variable Y if the outcome of the measurement of another
random variable X is known?
$$H(Y|X) = -\sum_{x} p(x)\sum_{y} p(y|x)\log_2 p(y|x)
= \sum_{x} p(x)\, H(Y|X=x)
= -\sum_{x,y} p(x,y)\log_2 p(y|x)$$
Cases:
H(Y|X) = H(Y) iff X and Y are independent random variables
H(Y|X) = 0 iff X completely determines Y


Mutual Information

The difference between the information obtained on average from learning the outcome
of a measurement of a random variable Y, and the information obtained on average from
learning the outcome of a measurement of Y when the outcome of a measurement of
another random variable X is already known:
$$I(Y{:}X) = H(Y) - H(Y|X) = I(X{:}Y) = H(X) - H(X|Y)
= \sum_{x,y} p(x,y)\log_2\left[\frac{p(x,y)}{p(x)\,p(y)}\right]$$
Mutual information quantifies how much information one random variable conveys
about another random variable.

Cases:
I(Y:X) = H(Y) - H(Y|X) = H(Y) iff X completely determines Y
I(Y:X) = H(Y) - H(Y|X) = 0 iff X and Y are independent random variables
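
A small sketch computing H(Y), H(Y|X), and I(X:Y) directly from a joint distribution, checking the two equivalent forms above (the joint probability table is made up for illustration):

```python
# Mutual information I(X:Y) = H(Y) - H(Y|X) from a joint distribution.
# The joint probability table below is made up for illustration.
import numpy as np

p_xy = np.array([[0.30, 0.10],    # rows: x, columns: y
                 [0.05, 0.55]])
p_x = p_xy.sum(axis=1)
p_y = p_xy.sum(axis=0)

def H(p):
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

H_y = H(p_y)
H_y_given_x = sum(p_x[i] * H(p_xy[i] / p_x[i]) for i in range(len(p_x)))
I = H_y - H_y_given_x
# Equivalent form: sum over x,y of p(x,y) log2[p(x,y)/(p(x)p(y))]
I_alt = np.sum(p_xy * np.log2(p_xy / np.outer(p_x, p_y)))
print(I, I_alt)   # identical
```
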


Classical Signals and Degrees of Freedom
How many degrees of freedom do bandwidth-limited real classical signals have?

[Figure: a time-domain signal x(t) and its spectrum x̃(ω), consisting of two narrow
bands of width 2πB centered at ±ω_o]

Answer: 2B real degrees of freedom per second (Nyquist theorem), where B is the
single-sided signal bandwidth in Hertz (not radians).

Recall from Chapter 5 that real narrowband signals can always be written as:
$$x(t) = \mathrm{Re}\left[a(t)\,e^{-i\omega_o t}\right] \qquad a(t) = x_1(t) + i x_2(t)$$
$$\Rightarrow\quad x(t) = x_1(t)\cos(\omega_o t) + x_2(t)\sin(\omega_o t)$$
So each time-domain sample of the signal carries information on two real degrees
of freedom. And by the Nyquist theorem, a band-limited signal can have at most B
independent samples per second (where B is the single-sided bandwidth in Hertz).

Classical Signals and Degrees of Freedom

One can sample the signal as follows, and then reconstruct the signal from these
samples:

[Figure: quadrature demodulation block diagram — the signal x(t) is multiplied by
2cos(ω_o t) and by 2sin(ω_o t), each product is passed through a low-pass filter of
bandwidth B, and the two outputs x₁(t) and x₂(t) are sampled at rate B:
x₁[n] = x₁(t = n/B), x₂[n] = x₂(t = n/B)]

$$x(t) = x_1(t)\cos(\omega_o t) + x_2(t)\sin(\omega_o t)$$

Classical Signals and Degrees of Freedom
Reconstruction of the quadratures from their samples (sinc interpolation):
$$x_1(t) = \sum_{n} x_1[n]\,\frac{\sin\left[\pi B(t-n/B)\right]}{\pi B(t-n/B)}
\qquad x_2(t) = \sum_{n} x_2[n]\,\frac{\sin\left[\pi B(t-n/B)\right]}{\pi B(t-n/B)}$$

[Figure: reconstruction block diagram — the sample streams x₁[n] and x₂[n] are
sinc-interpolated, multiplied by cos(ω_o t) and sin(ω_o t), and summed]

Construction of x(t):
$$x(t) = \sum_{n} x_1[n]\,\frac{\sin\left[\pi B(t-n/B)\right]}{\pi B(t-n/B)}\cos(\omega_o t)
+ \sum_{n} x_2[n]\,\frac{\sin\left[\pi B(t-n/B)\right]}{\pi B(t-n/B)}\sin(\omega_o t)$$

Time Domain Basis
Note that the signal can be expanded in an orthogonal time-domain basis set:
$$x(t) = \sum_{n} x_1[n]\,\frac{\sin\left[\pi B(t-n/B)\right]}{\pi B(t-n/B)}\cos(\omega_o t)
+ \sum_{n} x_2[n]\,\frac{\sin\left[\pi B(t-n/B)\right]}{\pi B(t-n/B)}\sin(\omega_o t)$$
The time-domain and time-localized functions
$$\frac{\sin\left[\pi B(t-n/B)\right]}{\pi B(t-n/B)}\cos(\omega_o t) \qquad
\frac{\sin\left[\pi B(t-n/B)\right]}{\pi B(t-n/B)}\sin(\omega_o t)$$
form a complete orthogonal set that can be used to expand any band-limited signal
centered at frequencies ±ω_o.


Classical Signals and Degrees of Freedom
$$x(t) = \mathrm{Re}\left[a(t)\,e^{-i\omega_o t}\right] \qquad a(t) = x_1(t) + ix_2(t)$$
$$x(t) = x_1(t)\cos(\omega_o t) + x_2(t)\sin(\omega_o t)$$

Power of a narrowband signal:
$$P(t) = \frac{1}{2}x_1^2(t) + \frac{1}{2}x_2^2(t) = \frac{1}{2}\left|a(t)\right|^2$$
Total energy of a narrowband signal:
$$E = \int_{-\infty}^{\infty} dt\, P(t)
= \frac{1}{2}\int_{-\infty}^{\infty} dt\, x_1^2(t) + \frac{1}{2}\int_{-\infty}^{\infty} dt\, x_2^2(t)
= \frac{1}{2}\left[\sum_n \frac{x_1^2[n]}{B} + \sum_n \frac{x_2^2[n]}{B}\right]$$
The total energy is just half the energy of all the orthogonal sinc pulses in the signal:
$$x(t) = \sum_n x_1[n]\,\frac{\sin\left[\pi B(t-n/B)\right]}{\pi B(t-n/B)}\cos(\omega_o t)
+ \sum_n x_2[n]\,\frac{\sin\left[\pi B(t-n/B)\right]}{\pi B(t-n/B)}\sin(\omega_o t)$$


Classical Communication and Channel Capacity: AWGN Channel

Encoder → Transmitter → Channel → Receiver → Decoder

Suppose one needs to send a message over a narrowband communication channel
(single-sided bandwidth B in Hertz, centered at frequencies ±ω_o).
● One can map the message to the amplitudes of the two quadratures
● Note that one can send only 2B different quadrature values per second:
$$x(t) = \sum_n x_1[n]\,\frac{\sin\left[\pi B(t-n/B)\right]}{\pi B(t-n/B)}\cos(\omega_o t)
+ \sum_n x_2[n]\,\frac{\sin\left[\pi B(t-n/B)\right]}{\pi B(t-n/B)}\sin(\omega_o t)$$


Classical Communication and Channel Capacity: AWGN Channel

Encoder → Transmitter → Channel → Receiver → Decoder

Suppose one sends N different quadratures through the channel in time N/2B:
$$y[1],\; y[2],\; y[3],\; y[4],\; \ldots,\; y[N]$$
The data to be transmitted and the mapping process will impart an a-priori probability
distribution p_in(y) on the quadrature amplitudes.

We assume there is also an energy/power constraint on the input:
$$\frac{1}{N}\sum_{n=1}^{N} y^2[n] = \int_{-\infty}^{\infty} y^2\, p_{in}(y)\,dy = P
\qquad P = \text{average power}$$


Classical Communication and Channel Capacity: AWGN Channel

Encoder → Transmitter → Channel (+ AWGN noise) → Receiver → Decoder

The channel adds noise, so that the received quadrature is:
$$z[n] = y[n] + f[n]$$
where f[n] represents zero-mean white Gaussian noise:
$$\langle f[n]\rangle = 0 \qquad \langle f[n]\,f[m]\rangle = M\,\delta_{n,m}
\qquad M = S_{ff}(\omega=\omega_o)\,B \qquad P(f) = \mathcal{N}(0, M)$$
Question: how much information (in bits) can be communicated over this channel
using these N quadratures?


Classical Communication and Channel Capacity: AWGN Channel

Encoder → Transmitter → Channel → Receiver → Decoder

Now consider an N-dimensional space:
$$z[n] = y[n] + f[n] \quad\Rightarrow\quad
\langle z^2[n]\rangle = \langle y^2[n]\rangle + \langle f^2[n]\rangle = P + M$$
Represent each possible value of the received quadrature sequence as a point in an
N-dimensional sphere of radius $(P+M)^{1/2}$. The noise is represented by an error region
of radius $M^{1/2}$ around each quadrature value that can be received.


Classical Communication and Channel Capacity: AWGN Channel

Encoder → Transmitter → Channel → Receiver → Decoder

The number of distinct quadrature sequences that can be received and distinguished
from each other in the presence of noise equals the number of non-overlapping
N-spheres of radius $M^{1/2}$ that can be packed in an N-sphere of radius $(P+M)^{1/2}$,
which equals:
$$\frac{(P+M)^{N/2}}{M^{N/2}} = \left(1+\frac{P}{M}\right)^{N/2}$$
So the information in bits that can be transferred using N quadratures is:
$$\log_2\left(1+\frac{P}{M}\right)^{N/2} = \frac{N}{2}\log_2\left(1+\frac{P}{M}\right)$$


Classical Communication and Channel Capacity: AWGN Channel

Encoder → Transmitter → Channel → Receiver → Decoder

● The information in bits that can be transferred using N quadratures is:
$$\log_2\left(1+\frac{P}{M}\right)^{N/2} = \frac{N}{2}\log_2\left(1+\frac{P}{M}\right)$$
● Then the information in bits that can be transferred using one quadrature is:
$$\frac{1}{2}\log_2\left(1+\frac{P}{M}\right)$$
● Since we can send 2B quadratures per second through the channel, the information
in bits that can be transferred per second is:
$$C = B\log_2\left(1+\frac{P}{M}\right)$$
C is called the capacity of the classical AWGN channel.

C. Shannon, "A Mathematical Theory of Communication", Bell System Technical
Journal, vol. 27, pp. 379–423 and 623–656, July and October 1948.
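
Plugging in numbers (a sketch; the bandwidth and SNR values below are made up for illustration):

```python
# Shannon capacity C = B log2(1 + P/M) for a few illustrative SNRs.
import math

B = 1e6  # channel bandwidth: 1 MHz (illustrative)
for snr in [1, 10, 100, 1000]:          # P/M (linear, not dB)
    C = B * math.log2(1 + snr)
    print(f"P/M = {snr:5d}  ->  C = {C/1e6:.2f} Mbits/s")
```
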


Mutual Information and Channel Capacity

Encoder → Transmitter → Channel → Receiver → Decoder
            (input Y)                (output Z)

The channel capacity (per usage) with input Y and output Z is defined as the maximum
of the mutual information over all possible input distributions, taking into account all
realistic constraints (such as the power/energy constraint):
$$C = \max_{p_{in}(y)} I(Z{:}Y) = \max_{p_{in}(y)}\left[H(Z) - H(Z|Y)\right]$$

Shannon's Noisy Channel Coding Theorem:
Any amount of information (in bits) less than or equal to C can be reliably transmitted
and recovered per usage of a noisy channel, with an error probability that approaches
zero as the number of uses of the channel becomes large.

C. Shannon, "A Mathematical Theory of Communication", Bell System Technical
Journal, vol. 27, pp. 379–423 and 623–656, July and October 1948.

Mutual Information and Channel Capacity: AWGN Channel

Encoder → Transmitter → Channel → Receiver → Decoder

The channel capacity (per usage) is more formally defined as the maximum of the mutual
information over all possible input distributions, taking into account the power/energy
constraint:
$$C = \max_{p_{in}(y):\ \int y^2 p_{in}(y)\,dy\,=\,P} I(Z{:}Y)
= \max_{p_{in}(y)}\left[H(Z) - H(Z|Y)\right]$$
For the AWGN channel:
$$z[n] = y[n] + f[n] \qquad \langle f[n]\rangle = 0 \qquad \langle f[n]\,f[m]\rangle = M\,\delta_{n,m}$$
$$C = \max_{p_{in}(y)}\left[H(Z) - \int dy\, p_{in}(y)\, H(Z|Y=y)\right]
= \max_{p_{in}(y)} H(Z) - \frac{1}{2}\log_2\left(2\pi e M\right)$$
Mutual information will be maximized if the output Z is Gaussian, and Z will be Gaussian
if the input Y is Gaussian.


Mutual Information and Channel Capacity: AWGN Channel

Encoder → Transmitter → Channel → Receiver → Decoder

$$C = \max_{p_{in}(y)} H(Z) - \frac{1}{2}\log_2\left(2\pi e M\right)$$
Mutual information will be maximized if the output Z is Gaussian, and Z will be Gaussian
if the input Y is Gaussian:
$$z[n] = y[n] + f[n]$$
If $p(f) = \mathcal{N}(0, M)$, then $p_{out}(z|y) = \mathcal{N}(y, M)$.
And then if $p_{in}(y) = \mathcal{N}(0, P)$:
$$p_{out}(z) = \int dy\, p_{out}(z|y)\, p_{in}(y) = \mathcal{N}(0, P+M)$$


Mutual Information and Channel Capacity: AWGN Channel

Encoder → Transmitter → Channel → Receiver → Decoder

For the AWGN channel:
$$C = \max_{p_{in}(y)} H(Z) - \frac{1}{2}\log_2\left(2\pi e M\right)$$
So if we assume for the input Y the a-priori probability distribution
$$p_{in}(y) = \frac{1}{\sqrt{2\pi P}}\,e^{-\frac{y^2}{2P}} = \mathcal{N}(0,P),
\qquad\text{which satisfies the constraint}\ \int y^2 p_{in}(y)\,dy = P,$$
then the output Z will have the probability distribution
$$p_{out}(z) = \frac{1}{\sqrt{2\pi(P+M)}}\,e^{-\frac{z^2}{2(P+M)}} = \mathcal{N}(0, P+M)$$
and the channel capacity (per quadrature) becomes:
$$C = \frac{1}{2}\log_2\left[2\pi e(P+M)\right] - \frac{1}{2}\log_2\left(2\pi e M\right)
= \frac{1}{2}\log_2\left(1+\frac{P}{M}\right)$$
Same as before!

Quantum Information: The Basics
The unit of quantum information is a "qubit" (not a bit):
$$|\psi\rangle = \alpha|0\rangle + \beta|1\rangle$$
Unlike the classical bit, a qubit can be in a superposition of the two logical states at
the same time.

The density operator:
The state of a quantum system is represented by a density operator ρ̂.
Density operator for a pure state: $\hat\rho = |\psi\rangle\langle\psi|$
Density operator for a mixed state (i.e. an ensemble of pure states): $\hat\rho = \sum_i p_i\,|\psi_i\rangle\langle\psi_i|$
Density operator for an ensemble of mixed states: $\hat\rho = \sum_i p_i\,\hat\rho_i$


Quantum Information: Von Neumann Entropy
The "information" content of a quantum state is related to the Von Neumann entropy:
$$S(\hat\rho) = -\mathrm{Tr}\left[\hat\rho\log_2\hat\rho\right]$$
The Von Neumann entropy plays three roles (that we know of so far):

1) It quantifies the quantum information content, in qubits, of a quantum state (i.e. the
minimum number of qubits needed to reliably encode the quantum state)

2) It also quantifies the classical information, in bits, that can be gained about the
quantum state by making the best possible measurement

3) It also quantifies the amount of entanglement in bipartite pure states

As you will see, the Von Neumann entropy will not always give the answer to the
question we will ask!


Von Neumann Entropy: Some Properties
$$S(\hat\rho) = -\mathrm{Tr}\left[\hat\rho\log_2\hat\rho\right]$$
1) Suppose $\hat\rho = |\psi\rangle\langle\psi|$ is a pure state. Then:
$$S(\hat\rho) = 0$$
2) Suppose $\hat\rho = \sum_i p_i\,|\psi_i\rangle\langle\psi_i|$ is an ensemble of pure ORTHOGONAL states. Then:
$$S(\hat\rho) = -\sum_i p_i\log_2 p_i = H \qquad\text{(the Shannon entropy of the ensemble)}$$
If the states in the ensemble are not all mutually orthogonal, then $S(\hat\rho) < H$.

3) Suppose $\hat\rho = \sum_i p_i\,\hat\rho_i$ is an ensemble of mixed states, where the mixed states in
the ensemble have support on ORTHOGONAL subspaces.


Von Neumann Entropy: Some Properties
3) Suppose $\hat\rho = \sum_i p_i\,\hat\rho_i$ is an ensemble of mixed states, where the mixed states in
the ensemble have support on ORTHOGONAL subspaces. Then:
$$S(\hat\rho) = -\mathrm{Tr}\left[\hat\rho\log_2\hat\rho\right]
= -\mathrm{Tr}\left[\left(\sum_i p_i\hat\rho_i\right)\log_2\left(\sum_j p_j\hat\rho_j\right)\right]
= -\mathrm{Tr}\left[\sum_i\left(p_i\hat\rho_i\right)\log_2\left(p_i\hat\rho_i\right)\right]$$
$$= -\mathrm{Tr}\left[\sum_i p_i\hat\rho_i\log_2 p_i + \sum_i p_i\hat\rho_i\log_2\hat\rho_i\right]
= -\sum_i p_i\log_2 p_i + \sum_i p_i\, S(\hat\rho_i) = H + \sum_i p_i\, S(\hat\rho_i)$$

4) Change of basis:
Entropy is invariant under a unitary transformation (or change of basis):
$$S\!\left(\hat{U}\hat\rho\,\hat{U}^\dagger\right) = S(\hat\rho)$$
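
Since ρ̂ is Hermitian, S(ρ̂) can be computed from its eigenvalues as −Σ λ log₂ λ. A minimal numpy sketch checking properties 1 and 2 above:

```python
# Von Neumann entropy from the eigenvalues of the density matrix,
# checking properties 1 and 2 above (a numerical sketch).
import numpy as np

def S(rho):
    lam = np.linalg.eigvalsh(rho)       # rho is Hermitian
    lam = lam[lam > 1e-12]              # drop zero eigenvalues (0 log 0 = 0)
    return -np.sum(lam * np.log2(lam))

# 1) A pure state has zero entropy
psi = np.array([1, 1j]) / np.sqrt(2)
rho_pure = np.outer(psi, psi.conj())
print(S(rho_pure))                      # ~0

# 2) An equal mixture of the ORTHOGONAL states |0>, |1>
rho_mix = 0.5*np.diag([1.0, 0.0]) + 0.5*np.diag([0.0, 1.0])
print(S(rho_mix))                       # 1.0 = Shannon entropy of {1/2, 1/2}
```
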


Quantum Messages
A quantum message M consists of a very long sequence of letters (or quantum
states) ρ̂_i:
$$M = \{\hat\rho_1, \hat\rho_2, \hat\rho_3, \ldots, \hat\rho_N\}$$
in which each letter belongs to an alphabet A of k letters:
$$A = \{\hat\rho_1, \hat\rho_2, \hat\rho_3, \ldots, \hat\rho_k\}$$
In the message, each letter ρ̂_i occurs with an a-priori probability $p_i$.

The density operator for each letter in the message is then:
$$\hat\rho = \sum_{i=1}^{k} p_i\,\hat\rho_i$$
The density operator for the entire message of N letters is then:
$$\hat\rho_N = \hat\rho\otimes\hat\rho\otimes\hat\rho\otimes\cdots\otimes\hat\rho$$
Question: is it possible to compress this long message to a smaller Hilbert space,
requiring fewer qubits, without compromising the fidelity of the message?

Quantum Fidelity
How can we tell if two quantum states are identical, similar, not so similar, etc.?

Example: in classical information theory we can judge the similarity or difference
between random variables Y and X by the mean square difference:
$$\langle(x-y)^2\rangle = \int dx\, dy\, (x-y)^2\, P(x,y)
\qquad\text{(This is not the only measure used.)}$$

Quantum Fidelity:
Given two quantum states ρ̂ and σ̂, the fidelity F, a measure of the closeness
between them, is generally defined as the quantity:
$$F(\hat\rho, \hat\sigma) = \left[\mathrm{Tr}\sqrt{\sqrt{\hat\sigma}\,\hat\rho\,\sqrt{\hat\sigma}}\right]^2 = F(\hat\sigma, \hat\rho)
\qquad\text{(This is not the only measure used.)}$$
Example: Suppose $\hat\rho = |\psi\rangle\langle\psi|$ and $\hat\sigma = |\phi\rangle\langle\phi|$. Then:
$$F(\hat\rho, \hat\sigma) = \left|\langle\psi|\phi\rangle\right|^2$$
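
A sketch checking that, for pure states, the general matrix formula reduces to the overlap |⟨ψ|φ⟩|² (this assumes scipy is available for the matrix square root):

```python
# Fidelity of two pure qubit states: the general matrix formula
# reduces to |<psi|phi>|^2.  (A numerical sketch; requires scipy.)
import numpy as np
from scipy.linalg import sqrtm

psi = np.array([1, 0])                          # |0>
phi = np.array([1, 1]) / np.sqrt(2)             # (|0> + |1>)/sqrt(2)
rho   = np.outer(psi, psi.conj())
sigma = np.outer(phi, phi.conj())

F_general = np.real(np.trace(sqrtm(sqrtm(sigma) @ rho @ sqrtm(sigma))))**2
F_overlap = np.abs(psi.conj() @ phi)**2
print(F_general, F_overlap)                     # both 0.5
```
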
Quantum Messages and Quantum Data Compression
$$M = \{\hat\rho_1, \hat\rho_2, \hat\rho_3, \ldots, \hat\rho_N\} \qquad
\hat\rho = \sum_{i=1}^{k} p_i\,\hat\rho_i \qquad
A = \{\hat\rho_1, \hat\rho_2, \hat\rho_3, \ldots, \hat\rho_k\}$$
A quantum message M consisting of a very long sequence of letters (or quantum
states) ρ̂_i can be compressed to NC qubits, in the limit of large N, without loss of
fidelity, where:
$$I(\hat\rho) \le C \le S(\hat\rho) \qquad
I(\hat\rho) \equiv S(\hat\rho) - \sum_{i=1}^{k} p_i\, S(\hat\rho_i)$$
The lower limit is achievable if the alphabet A consists of pure states (not necessarily
orthogonal), or if the different letters in the alphabet commute.

B. Schumacher, Phys. Rev. A 51, 2738 (1995).
M. Horodecki, Phys. Rev. A 57, 3364 (1998).

Classical Information from Quantum Messages
Question: How much classical information in bits can be obtained from a quantum
message by making the best possible measurement?

Generalized quantum measurements and POVMs:
The most general measurements to obtain classical information from quantum states
can be described in terms of a complete set of positive Hermitian operators $\hat{F}_j$ which
provide a resolution of the identity operator:
$$\sum_j \hat{F}_j = \hat{1}$$
These generalized measurements constitute a positive operator valued measure
(POVM). The probability $p_k$ that the outcome of a measurement on a quantum
state ρ̂ will be k is given as:
$$p_k = \mathrm{Tr}\left[\hat\rho\,\hat{F}_k\right]$$
Example: For a photon number measurement on a quantum state of light in a
cavity, the POVM is formed by the operators $|n\rangle\langle n|$ and the probabilities are
given as:
$$p(n) = \mathrm{Tr}\left[\hat\rho\,|n\rangle\langle n|\right]$$


Classical Information from Quantum Messages

Suppose a quantum message is made up of qubits:
$$|\psi\rangle = \alpha|0\rangle + \beta|1\rangle$$
The quantum state is specified by two complex numbers, and each can take any value.
But the classical information that can be extracted from the above qubit is just one bit!

Suppose the sender sends the following two states with a-priori probability 1/2 each:
$$|0\rangle \quad\text{and}\quad |1\rangle
\qquad\Rightarrow\qquad \hat\rho = \begin{pmatrix} 1/2 & 0 \\ 0 & 1/2 \end{pmatrix}$$
One can use the following POVM:
$$\hat{F}_0 = |0\rangle\langle 0| \qquad \hat{F}_1 = |1\rangle\langle 1| \qquad \sum_j \hat{F}_j = \hat{1}$$
And:
$$S(\hat\rho) = 1\ \text{bit}$$
The accessible information is only 1 bit!
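
For contrast, a sketch of the quantity χ = S(ρ̂) − Σᵢ pᵢ S(ρ̂ᵢ) (formally introduced as the Holevo bound on the next slides) for two pure states sent with probability 1/2 each: orthogonal |0⟩, |1⟩ give 1 bit, while non-orthogonal states give less.

```python
# Holevo quantity chi = S(rho) - sum_i p_i S(rho_i) for two pure states
# sent with probability 1/2 each (a numerical sketch).
import numpy as np

def S(rho):
    lam = np.linalg.eigvalsh(rho)
    lam = lam[lam > 1e-12]
    return -np.sum(lam * np.log2(lam))

def chi(states, probs):
    rhos = [np.outer(v, v.conj()) for v in states]
    rho = sum(p * r for p, r in zip(probs, rhos))
    return S(rho) - sum(p * S(r) for p, r in zip(probs, rhos))

ket0 = np.array([1.0, 0.0])
ket1 = np.array([0.0, 1.0])
ket_plus = (ket0 + ket1) / np.sqrt(2)

print(chi([ket0, ket1], [0.5, 0.5]))       # 1.0 bit (orthogonal states)
print(chi([ket0, ket_plus], [0.5, 0.5]))   # ~0.60 bit (non-orthogonal)
```
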


Quantum Cryptography: The BB84 Protocol



The Holevo Bound: Case of Separate Measurements
A theorem, stated by Jim Gordon (without proof) in 1964, and proved by Holevo (1973),
gives an upper bound on the amount of classical information (in bits) that can be
gained from a quantum message of N letters by making the best possible measurement
on each letter individually:
$$M = \{\hat\rho_1, \hat\rho_2, \ldots, \hat\rho_N\} \qquad
\hat\rho = \sum_{i=1}^{k} p_i\,\hat\rho_i \qquad
A = \{\hat\rho_1, \hat\rho_2, \ldots, \hat\rho_k\}
\qquad \hat\rho_N = \underbrace{\hat\rho\otimes\hat\rho\otimes\cdots\otimes\hat\rho}_{N\ \text{terms}}$$
For the above quantum message, the obtained classical information I(M:P) per letter
about the preparation of the message, using the optimal measurement scheme, is
bounded as follows:
$$\max_{\hat{F}} I(M{:}P) \le I(\hat\rho) \equiv S(\hat\rho) - \sum_{i=1}^{k} p_i\, S(\hat\rho_i)$$
The upper limit in Holevo's theorem can be achieved if and only if the quantum states
of all the letters in the alphabet A commute, i.e. $[\hat\rho_i, \hat\rho_k] = 0$.

J. P. Gordon, in Quantum Electronics and Coherent Light, edited by P. A. Miles,
Academic Press (1964).
A. S. Kholevo, Probl. Peredachi Inf. 9, 177 (1973).

Proof of the Holevo Bound: Case of Separate Measurements
If all the letters in the alphabet commute, then they can all be diagonalized in a
common orthonormal basis $\{|\varphi_\alpha\rangle\}$, and:
$$\hat\rho_i = \sum_\alpha f_{i,\alpha}\,|\varphi_\alpha\rangle\langle\varphi_\alpha|
\qquad\Rightarrow\qquad
S(\hat\rho_i) = -\sum_\alpha f_{i,\alpha}\log_2 f_{i,\alpha}$$
What this means is that even if the quantum letter or state is known, there is still
entropy related to the outcomes of measurements performed on it, since it is not a
pure state.

So if we choose the POVM to be:
$$\hat{F}_\alpha = |\varphi_\alpha\rangle\langle\varphi_\alpha|$$
then the probability $p_\alpha$ of measuring α is:
$$p_\alpha = \mathrm{Tr}\left[\hat\rho\,\hat{F}_\alpha\right]
= \mathrm{Tr}\left[\left(\sum_i p_i\hat\rho_i\right)|\varphi_\alpha\rangle\langle\varphi_\alpha|\right]
= \sum_i p_i\, f_{i,\alpha}$$
The entropy comes out to be:
$$S(\hat\rho) = -\sum_\alpha p_\alpha\log_2 p_\alpha = H(M)$$


The Holevo Bound: Case of Separate Measurements
The maximum classical information is the mutual information between the measurement
result and the preparation of the quantum state of each letter in the message:
$$\max_{\hat{F}} I(M{:}P) = H(M) - H(M|P)
= -\sum_\alpha p_\alpha\log_2 p_\alpha + \sum_i p_i\sum_\alpha f_{i,\alpha}\log_2 f_{i,\alpha}
= S(\hat\rho) - \sum_{i=1}^{k} p_i\, S(\hat\rho_i)$$


The Holevo Bound: Case of Block Measurements
$$M = \{\hat\rho_1, \hat\rho_2, \ldots, \hat\rho_N\} \qquad
\hat\rho = \sum_{i=1}^{k} p_i\,\hat\rho_i \qquad
A = \{\hat\rho_1, \hat\rho_2, \ldots, \hat\rho_k\}
\qquad \hat\rho_N = \underbrace{\hat\rho\otimes\hat\rho\otimes\cdots\otimes\hat\rho}_{N\ \text{terms}}$$
If, instead of separate measurements on each letter of the message, one is allowed to
make optimal measurements on all N letters of the message at the same time, then the
upper limit in Holevo's theorem can be achieved even if the letters in the alphabet A do
not commute, i.e. even if $[\hat\rho_i, \hat\rho_k] \ne 0$.


Classical Information Over Quantum Channel

[Diagram: input state ρ̂ → Channel (noise sources, loss, bath) → output E(ρ̂)]

The channel is described by a trace-preserving linear quantum operation E such that
the density operator of each letter at the output of the channel is related to the density
operator at the input of the channel by the relation:
$$\hat\rho \to E(\hat\rho) = \sum_j \hat{E}_j\,\hat\rho\,\hat{E}_j^\dagger
\qquad \sum_j \hat{E}_j^\dagger\hat{E}_j = \hat{1}$$
What is really happening is the following:

Initial state of the channel input and the bath:
$$\hat\rho_{initial} = \hat\rho\otimes|B_0\rangle\langle B_0|$$
After propagation through the channel (unitary time evolution):
$$\hat\rho_{final} = \hat{U}\left(\hat\rho\otimes|B_0\rangle\langle B_0|\right)\hat{U}^\dagger
\qquad \hat{U} = e^{-i\hat{H}t/\hbar}$$
After tracing over all bath degrees of freedom, the channel output is:
$$\hat\rho_{output} = \mathrm{Tr}_{Bath}\left[\hat{U}\left(\hat\rho\otimes|B_0\rangle\langle B_0|\right)\hat{U}^\dagger\right]
= \sum_j \hat{E}_j\,\hat\rho\,\hat{E}_j^\dagger$$

Classical Information Over Quantum Channel

[Diagram: ρ̂ → Channel → E(ρ̂)]

The channel is described by a trace-preserving linear quantum operation E such that:
$$\hat\rho \to E(\hat\rho) = \sum_j \hat{E}_j\,\hat\rho\,\hat{E}_j^\dagger
\qquad \sum_j \hat{E}_j^\dagger\hat{E}_j = \hat{1}$$
Classical information over the channel is encoded in a quantum message M consisting
of a very long sequence of letters (or quantum states) ρ̂_i:
$$M = \{\hat\rho_1, \hat\rho_2, \hat\rho_3, \ldots, \hat\rho_N\}$$
in which each letter belongs to an alphabet A of k letters:
$$A = \{\hat\rho_1, \hat\rho_2, \hat\rho_3, \ldots, \hat\rho_k\}$$
In the message, each letter ρ̂_i occurs with an a-priori probability $p_i$.

The density operators for each letter in the message and of the full message are then:
$$\hat\rho = \sum_{i=1}^{k} p_i\,\hat\rho_i
\qquad \hat\rho_N = \hat\rho\otimes\hat\rho\otimes\cdots\otimes\hat\rho$$
Question: How much classical information in bits can be communicated over the
channel per letter?

Classical Information Over Quantum Channel: Channel Capacity

[Diagram: ρ̂ → Channel → E(ρ̂)]

The Holevo-Schumacher-Westmoreland (HSW) Theorem:

The classical capacity of this quantum channel per letter is:
$$C = \max_{p_i}\left[S\!\left(E(\hat\rho)\right) - \sum_{i=1}^{k} p_i\, S\!\left(E(\hat\rho_i)\right)\right]$$
The classical capacity of the quantum channel is achievable (even for non-commuting
letters in the message) if the receiver is allowed to make block measurements on all
received letters.

Note:
This capacity is also called the fixed-alphabet product-state capacity, since 1) the
optimization is not performed over the choice of input letters ρ̂_i, and 2) the input
letters are not assumed to be entangled over multiple uses of the channel, and
therefore the input density operator is in a tensor product form.


Classical Information Over a Photonic Channel: Photon Number States and Photon Number Detection
$$M = \{\hat\rho_1, \hat\rho_2, \ldots, \hat\rho_N\} \qquad
A = \{|0\rangle\langle 0|, |1\rangle\langle 1|, |2\rangle\langle 2|, \ldots, |n\rangle\langle n|, \ldots\}$$
$$\hat\rho = \sum_{n=0}^{\infty} p_{in}(n)\,|n\rangle\langle n|
\qquad \hat\rho_N = \underbrace{\hat\rho\otimes\hat\rho\otimes\cdots\otimes\hat\rho}_{N\ \text{terms}}$$
● The channel is bandlimited (single-sided bandwidth B, centered at ±ω_o)
● Optical pulses are used and the classical information is encoded in the number of
photons in each optical pulse
From the previous discussion, at most B such pulses can be sent per second.

Power constraint:
$$\sum_{n=0}^{\infty} p_{in}(n)\,n\,B\hbar\omega_o = P
\qquad\Rightarrow\qquad \sum_{n=0}^{\infty} p_{in}(n)\,n = \frac{P}{B\hbar\omega_o} \equiv n_o$$


Classical Information Over a Photonic Channel: Photon Number States and Photon Number Detection
$$\hat\rho = \sum_{n=0}^{\infty} p_{in}(n)\,|n\rangle\langle n|
\quad\to\quad\text{Channel}\quad\to\quad
\hat\rho = \sum_{n=0}^{\infty} p_{in}(n)\,|n\rangle\langle n|$$
The maximum classical information transmitted over the channel per letter is then:
$$C = \max_{p_i}\left[S\!\left(E(\hat\rho)\right) - \sum_i p_i\, S\!\left(E(\hat\rho_i)\right)\right]
= \max_{p_{in}(n)}\left[S(\hat\rho) - \sum_n p_{in}(n)\, S\!\left(|n\rangle\langle n|\right)\right]
= \max_{p_{in}(n)} S\!\left(\sum_{n=0}^{\infty} p_{in}(n)\,|n\rangle\langle n|\right)$$
since the letters are pure number states with $S(|n\rangle\langle n|) = 0$. The ideal POVM is:
$$\hat{F}_n = |n\rangle\langle n| \qquad \sum_n\hat{F}_n = \hat{1}$$
The entropy is maximized for a thermal distribution of photons in every pulse:
$$p_{in}(n) = \frac{1}{1+n_o}\left(\frac{n_o}{1+n_o}\right)^n
\qquad n_o = \sum_{n=0}^{\infty} p_{in}(n)\,n = \frac{P}{B\hbar\omega_o}$$
$$\Rightarrow\quad C = \log_2\left(1+\frac{P}{B\hbar\omega_o}\right)
+ \frac{P}{B\hbar\omega_o}\log_2\left(1+\frac{B\hbar\omega_o}{P}\right)$$


Classical Information Over a Photonic Channel: Photon Number States and Photon Number Detection

The capacity in bits per letter is:
$$C = \log_2\left(1+\frac{P}{B\hbar\omega_o}\right)
+ \frac{P}{B\hbar\omega_o}\log_2\left(1+\frac{B\hbar\omega_o}{P}\right)$$

[Figure: capacity C in bits per letter vs P/Bℏω_o, showing the two contributions
log₂(1 + P/Bℏω_o) and (P/Bℏω_o)·log₂(1 + Bℏω_o/P)]

The capacity in bits per second is:
$$C = B\log_2\left(1+\frac{P}{B\hbar\omega_o}\right)
+ \frac{P}{\hbar\omega_o}\log_2\left(1+\frac{B\hbar\omega_o}{P}\right)$$
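
Evaluating the bits-per-second formula numerically (a sketch; the bandwidth and the mean photon numbers are illustrative):

```python
# Capacity (bits/s) of the photon-number channel,
# C = B [log2(1 + n) + n log2(1 + 1/n)] with n = P/(B hbar w_o).
# B and the mean photon numbers below are illustrative.
import math

def C_photon(B, n):                      # n = P / (B * hbar * omega_o)
    return B * (math.log2(1 + n) + n * math.log2(1 + 1/n))

def C_awgn_like(B, n):                   # high-power comparison: B log2(1+n)
    return B * math.log2(1 + n)

B = 1e9  # 1 GHz of optical bandwidth
for n in [0.01, 1.0, 100.0]:
    print(f"n = {n:6.2f}: C = {C_photon(B, n)/1e9:7.3f} Gbit/s "
          f"(AWGN-like term: {C_awgn_like(B, n)/1e9:7.3f})")
```
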


Classical Information Over a Photonic Channel: Photon Number States and Photon Number Detection
The High Power Limit:
In the limit $P \gg B\hbar\omega_o$ the capacity (bits/s) becomes:
$$C \approx B\log_2\left(1+\frac{P}{B\hbar\omega_o}\right)$$
Compare the above to the classical AWGN channel result:
$$C = B\log_2\left(1+\frac{P}{B\,S_{ff}(\omega=\omega_o)}\right)$$
In the limit $P \gg B\hbar\omega_o$, the quantum channel result is as if it were a classical AWGN
channel with added white noise with a noise power spectral density of ℏω_o.

WHY???


Classical Information Over a Photonic Channel: Photon Number States and Photon Number Detection
The Low Power Limit:
In the limit $P \ll B\hbar\omega_o$ the capacity (bits/s) becomes:
$$C \approx \frac{P}{\hbar\omega_o}\log_2\left(\frac{B\hbar\omega_o}{P}\right)$$
How do we understand the above result?

For small signal powers P, choose a transmission time T long enough that one photon
gets transmitted in time T. Then:
$$T = \frac{\hbar\omega_o}{P}$$
If the channel bandwidth is B, then the transmission time T can be divided into BT
time slots. The transmitted photon can occupy any one of these time slots. The
information in bits transmitted per second by that one photon, and therefore the
channel capacity, becomes:
$$C \approx \frac{\log_2(BT)}{T} = \frac{P}{\hbar\omega_o}\log_2\left(\frac{B\hbar\omega_o}{P}\right)$$

Classical Information Over a Photonic Channel: Coherent States and Photon Number Detection
$$\hat\rho = \int d^2\alpha\, p_{in}(\alpha)\,|\alpha\rangle\langle\alpha|
\quad\to\quad\text{Channel}\quad\to\quad
\hat\rho = \int d^2\alpha\, p_{in}(\alpha)\,|\alpha\rangle\langle\alpha|$$
$$M = \{\hat\rho_1, \hat\rho_2, \ldots, \hat\rho_N\} \qquad A = \{|\alpha\rangle\}
\qquad \hat\rho_N = \underbrace{\hat\rho\otimes\hat\rho\otimes\cdots\otimes\hat\rho}_{N\ \text{terms}}$$
POVM:
$$\hat{F}_n = |n\rangle\langle n| \qquad \sum_n\hat{F}_n = \hat{1}$$
● The channel is bandlimited
● Optical pulses are used and the classical information is encoded in the amplitude
quadrature of each optical pulse
From the previous discussion, at most B such pulses can be sent per second.

Power constraint:
$$B\hbar\omega_o\int d^2\alpha\, p_{in}(\alpha)\,|\alpha|^2 = P$$


Classical Information Over a Photonic Channel: Coherent States and Photon Number Detection

The chosen POVM, given below, for detection is (possibly) not the optimal POVM
for the coherent state alphabet (as we will see later):
$$\hat{F}_n = |n\rangle\langle n| \qquad \sum_n\hat{F}_n = \hat{1}$$
Since the channel capacity definition includes use of the optimal POVM, which we are
(possibly) not using, we just calculate the mutual information between the channel input
and the detector output.

The conditional probability of detecting n photons, given the input $|\alpha\rangle$, is:
$$p_{out}(n|\alpha) = \mathrm{Tr}\left[|\alpha\rangle\langle\alpha|\hat{F}_n\right] = e^{-|\alpha|^2}\frac{|\alpha|^{2n}}{n!}
\qquad
p_{out}(n) = \mathrm{Tr}\left[\hat\rho\,\hat{F}_n\right] = \int d^2\alpha\, p_{in}(\alpha)\,e^{-|\alpha|^2}\frac{|\alpha|^{2n}}{n!}$$
The optimal mutual information between the channel input I and the detector
output O is:
$$\max_{p_{in}(\alpha)} I(O{:}I) = \max_{p_{in}(\alpha)}\left[H(O) - H(O|I)\right]
= \max_{p_{in}(\alpha)}\left[-\sum_{n=0}^{\infty} p_{out}(n)\log_2 p_{out}(n)
+ \int d^2\alpha\, p_{in}(\alpha)\sum_{n=0}^{\infty} p_{out}(n|\alpha)\log_2 p_{out}(n|\alpha)\right]$$


Classical Information Over a Photonic Channel: Coherent States and Photon Number Detection

$$\max_{p_{in}(\alpha)} I(O{:}I)
= \max_{p_{in}(\alpha)}\left[-\sum_{n=0}^{\infty} p_{out}(n)\log_2 p_{out}(n)
+ \int d^2\alpha\, p_{in}(\alpha)\sum_{n=0}^{\infty} p_{out}(n|\alpha)\log_2 p_{out}(n|\alpha)\right]$$
The above needs to be maximized over $p_{in}(\alpha)$ under the power constraint:
$$B\hbar\omega_o\int d^2\alpha\, p_{in}(\alpha)\,|\alpha|^2 = P$$
The maximization turns out to be analytically cumbersome, but the results in the low
power and high power limits are known.

The Low Power Limit: $P \ll B\hbar\omega_o$
$$I(O{:}I) \approx \frac{P}{\hbar\omega_o}\log_2\left(\frac{B\hbar\omega_o}{P}\right)
\qquad\text{Same as in the case of using photon number states}$$
The High Power Limit: $P \gg B\hbar\omega_o$
$$I(O{:}I) \approx \frac{B}{2}\log_2\left(\frac{P}{B\hbar\omega_o}\right)
\qquad\text{One half of the result in the case of using photon number states. WHY?!?}$$

Classical Information Over a Photonic Channel: Coherent States and Balanced Heterodyne Detection
$$M = \{\hat\rho_1, \hat\rho_2, \ldots, \hat\rho_N\} \qquad A = \{|\alpha\rangle\}
\qquad \hat\rho = \int d^2\alpha\, p_{in}(\alpha)\,|\alpha\rangle\langle\alpha|
\qquad \hat\rho_N = \underbrace{\hat\rho\otimes\hat\rho\otimes\cdots\otimes\hat\rho}_{N\ \text{terms}}$$
POVM:
$$\hat{F}_\beta = \frac{1}{\pi}\,|\beta\rangle\langle\beta| \qquad \int d^2\beta\,\hat{F}_\beta = \hat{1}$$
● The channel is bandlimited
● Optical pulses are used and the classical information is encoded in the two
quadratures of each optical pulse
From the previous discussion, at most B such pulses can be sent per second.

Power constraint:
$$B\hbar\omega_o\int d^2\alpha\, p_{in}(\alpha)\,|\alpha|^2 = P$$


Classical Information Over a Photonic Channel: Coherent States and Balanced Heterodyne Detection

POVM:
$$\hat{F}_\beta = \frac{1}{\pi}\,|\beta\rangle\langle\beta| \qquad \int d^2\beta\,\hat{F}_\beta = \hat{1}$$
The chosen POVM implies that both field quadratures are measured simultaneously!

Since the channel capacity definition includes use of the optimal POVM, which we are
(possibly) not using, we just calculate the mutual information between the channel input
and the detector output.

The conditional probability of detecting β, given the input $|\alpha\rangle$, is:
$$p_{out}(\beta|\alpha) = \mathrm{Tr}\left[|\alpha\rangle\langle\alpha|\hat{F}_\beta\right]
= \frac{1}{\pi}\left|\langle\beta|\alpha\rangle\right|^2
\qquad
p_{out}(\beta) = \mathrm{Tr}\left[\hat\rho\,\hat{F}_\beta\right]
= \int d^2\alpha\, p_{in}(\alpha)\, p_{out}(\beta|\alpha)$$
The optimal mutual information between the channel input I and the heterodyne detector
output O is:
$$\max_{p_{in}(\alpha)} I(O{:}I) = \max_{p_{in}(\alpha)}\left[H(O) - H(O|I)\right]
= \max_{p_{in}(\alpha)}\left[-\int d^2\beta\, p_{out}(\beta)\log_2 p_{out}(\beta)
+ \int d^2\alpha\, p_{in}(\alpha)\int d^2\beta\, p_{out}(\beta|\alpha)\log_2 p_{out}(\beta|\alpha)\right]$$


Classical Information Over a Photonic Channel: Coherent States and Balanced Heterodyne Detection

$$p_{out}(\beta|\alpha) = \frac{1}{\pi}\left|\langle\beta|\alpha\rangle\right|^2
= \frac{1}{\pi}\,e^{-|\beta-\alpha|^2}
= \frac{1}{\sqrt{2\pi(1/2)}}\,e^{-\frac{(\beta_r-\alpha_r)^2}{2(1/2)}}
\times\frac{1}{\sqrt{2\pi(1/2)}}\,e^{-\frac{(\beta_i-\alpha_i)^2}{2(1/2)}}$$
Gaussian!! With added noise having a variance of 1/2 in each quadrature!!

To maximize I(O:I), we need
$$p_{out}(\beta) = \int d^2\alpha\, p_{in}(\alpha)\, p_{out}(\beta|\alpha)$$
to be Gaussian as well, and this is possible if $p_{in}(\alpha)$ is Gaussian and satisfies the
power constraint:
$$\int d^2\alpha\, p_{in}(\alpha)\,|\alpha|^2 = \frac{P}{B\hbar\omega_o}$$
The maximization yields (just as in the AWGN case) the capacity per letter
(consisting of two quadratures):
$$C = 2\times\frac{1}{2}\log_2\left(\frac{P/2B\hbar\omega_o + 1/2}{1/2}\right)
= \log_2\left(1+\frac{P}{B\hbar\omega_o}\right)$$

Classical Information Over a Photonic Channel: Coherent States and Balanced Heterodyne Detection

The max information per letter (consisting of two quadratures) is:
$$C = \log_2\left(1+\frac{P}{B\hbar\omega_o}\right)$$
And since one can send B letters per second, the capacity in bits/s is:
$$C = B\log_2\left(1+\frac{P}{B\hbar\omega_o}\right)$$
The result, although identical to the one obtained using photon number states and
photon number detection (in the high power limit), has more similarities with the
classical AWGN result if:

1) each quadrature of the input coherent state is assumed to be a classical
variable, and
2) the channel adds white Gaussian noise to each quadrature, and
3) the power spectral density of the added white Gaussian noise is
assumed to be (1/2)ℏω_o


Classical Information Over a Lossy Photonic Channel: Number States and Photon Number Detection
$$\hat\rho = \sum_{n=0}^{\infty} p_{in}(n)\,|n\rangle\langle n|
\quad\to\quad\text{Channel (loss, bath)}\quad\to\quad
E(\hat\rho) = \sum_{n=0}^{\infty} p_{out}(n)\,|n\rangle\langle n|$$
Input power constraint:
$$\sum_{n=0}^{\infty} p_{in}(n)\,n\,B\hbar\omega_o = P_{in}
\qquad\Rightarrow\qquad \sum_{n=0}^{\infty} p_{in}(n)\,n = \frac{P_{in}}{B\hbar\omega_o} \equiv \bar{n}_{in}$$
POVM: $\hat{F}_n = |n\rangle\langle n|$

How do we model photon loss in the channel?
$$\hat\rho \to E(\hat\rho) = \sum_j \hat{E}_j\,\hat\rho\,\hat{E}_j^\dagger$$
Channel power transmissivity: T        Channel power loss: 1−T
We know how a photon number state behaves in the presence of loss:
$$|n\rangle\otimes|0\rangle_B \;\to\; \sum_{m=0}^{n}\sqrt{\frac{n!}{m!\,(n-m)!}}\,
\left(\sqrt{T}\right)^m\left(\sqrt{1-T}\right)^{n-m}\,|m\rangle\otimes|n-m\rangle_B$$
Binomial distribution of photons:
$$|n\rangle\langle n| \;\to\; E\!\left(|n\rangle\langle n|\right)
= \sum_{m=0}^{n}\frac{n!}{m!\,(n-m)!}\,T^m(1-T)^{n-m}\,|m\rangle\langle m|$$

Classical Information Over a Lossy Photonic Channel: Number States and Photon Number Detection

The conditional probability of detecting m photons at the output, given the input $|n\rangle$, is:
$$p_{out}(m|n) = \frac{n!}{m!\,(n-m)!}\,T^m(1-T)^{n-m}
\quad (m \le n;\ 0\ \text{otherwise})
\qquad
p_{out}(m) = \sum_{n=0}^{\infty} p_{out}(m|n)\, p_{in}(n)$$
Output photon number and power:
$$\sum_{n=0}^{\infty} p_{out}(n)\,n = \bar{n}_{out} = \bar{n}_{in}T
= \frac{P_{in}T}{B\hbar\omega_o} = \frac{P_{out}}{B\hbar\omega_o}$$
Input and output states:
$$\hat\rho = \sum_{n=0}^{\infty} p_{in}(n)\,|n\rangle\langle n|
\;\to\; E(\hat\rho)
= \sum_{n=0}^{\infty} p_{in}(n)\sum_{m=0}^{n}\frac{n!}{m!\,(n-m)!}\,T^m(1-T)^{n-m}\,|m\rangle\langle m|
= \sum_{n=0}^{\infty} p_{out}(n)\,|n\rangle\langle n|$$

Classical Information Over a Lossy Photonic Channel: Number States and Photon Number Detection

The channel capacity is (the POVM is optimal):
$$C = \max_{p_{in}(n)}\left[S\!\left(E(\hat\rho)\right) - \sum_i p_i\, S\!\left(E(\hat\rho_i)\right)\right]
= \max_{p_{in}(n)}\left[S\!\left(\sum_{n=0}^{\infty} p_{out}(n)\,|n\rangle\langle n|\right)
- \sum_{n=0}^{\infty} p_{in}(n)\, S\!\left(\sum_{m=0}^{n} p_{out}(m|n)\,|m\rangle\langle m|\right)\right]$$
It is not difficult to evaluate:
$$S\!\left(\sum_{m=0}^{n} p_{out}(m|n)\,|m\rangle\langle m|\right)
= -\sum_{m=0}^{n} p_{out}(m|n)\log_2 p_{out}(m|n)
\approx \frac{1}{2}\log_2\left[1 + 2\pi e\, nT(1-T)\right]$$
The channel capacity becomes:
$$C \approx \max_{p_{in}(n)}\left[-\sum_{n=0}^{\infty} p_{out}(n)\log_2 p_{out}(n)
- \sum_{n=0}^{\infty} p_{in}(n)\,\frac{1}{2}\log_2\left[1 + 2\pi e\, nT(1-T)\right]\right]$$


Classical Information Over a Lossy Photonic Channel: Number States and Photon Number Detection

$$C \approx \max_{p_{in}(n)}\left[-\sum_{n=0}^{\infty} p_{out}(n)\log_2 p_{out}(n)
- \sum_{n=0}^{\infty} p_{in}(n)\,\frac{1}{2}\log_2\left[1 + 2\pi e\, nT(1-T)\right]\right]$$
To maximize the capacity we want the output distribution to be thermal.
The output distribution can be thermal if the input distribution is also thermal:
$$p_{in}(n) = \frac{1}{1+\bar{n}_{in}}\left(\frac{\bar{n}_{in}}{1+\bar{n}_{in}}\right)^n
\qquad\Rightarrow\qquad
p_{out}(n) = \frac{1}{1+\bar{n}_{in}T}\left(\frac{\bar{n}_{in}T}{1+\bar{n}_{in}T}\right)^n$$


Classical Information Over a Lossy Photonic Channel: Number States and Photon Number Detection

The optimal mutual information (bits per use) between the input I and the output O is:
$$I(O{:}I) = \max_{p_{in}(n)}\left[-\sum_{n=0}^{\infty} p_{out}(n)\log_2 p_{out}(n)\right]
- \frac{1}{2}\log_2\left[1 + \gamma\pi e\,\bar{n}_{in}T(1-T)\right]$$
$$= \log_2\left(1+\bar{n}_{in}T\right) + \bar{n}_{in}T\log_2\left(1+\frac{1}{\bar{n}_{in}T}\right)
- \frac{1}{2}\log_2\left[1 + \gamma\pi e\,\bar{n}_{in}T(1-T)\right]$$
$$= \log_2\left(1+\frac{P_{out}}{B\hbar\omega_o}\right)
+ \frac{P_{out}}{B\hbar\omega_o}\log_2\left(1+\frac{B\hbar\omega_o}{P_{out}}\right)
- \frac{1}{2}\log_2\left[1 + \gamma\pi e\,\frac{P_{out}}{B\hbar\omega_o}(1-T)\right]$$
Here, γ is a number with a value between 1 and 2 that depends on the value of T.

The Low Power Limit: $P_{out} \ll B\hbar\omega_o$
$$I(O{:}I) \approx \frac{P_{out}}{\hbar\omega_o}\log_2\left(\frac{B\hbar\omega_o}{P_{out}}\right)
\quad\text{(now bits/s)}$$
The High Power Limit: $P_{out} \gg B\hbar\omega_o$ (T < 1)
$$I(O{:}I) \approx \frac{B}{2}\log_2\left(\frac{P_{out}}{B\hbar\omega_o}\right)
\quad\text{(now bits/s)}$$
WHY?!?

Classical Information Over a Lossy Photonic Channel: Capacity
$$\hat\rho = \sum_{i=1}^{k} p_i\,\hat\rho_i
\quad\to\quad\text{Channel (loss, bath)}\quad\to\quad
E(\hat\rho) \quad\to\quad\text{POVM } \hat{F}_j = ?$$
The capacity of this quantum channel per letter is:
$$C = \max_{p_i}\left[S\!\left(E(\hat\rho)\right) - \sum_{i=1}^{k} p_i\, S\!\left(E(\hat\rho_i)\right)\right]$$
Channel power transmissivity: T        Channel power loss: 1−T

Input power constraint:
$$\mathrm{Tr}\left[\hat\rho\,\hat{n}\right]B\hbar\omega_o = P_{in} \qquad P_{out} = P_{in}T$$
$$C = \log_2\left(1+\frac{P_{out}}{B\hbar\omega_o}\right)
+ \frac{P_{out}}{B\hbar\omega_o}\log_2\left(1+\frac{B\hbar\omega_o}{P_{out}}\right)$$




Quantum Information: Von Neumann Entropy
The "information" content of a quantum state is related to the Von Neumann entropy:
$$S(\hat\rho) = -\mathrm{Tr}\left[\hat\rho\log_2\hat\rho\right]$$
The Von Neumann entropy plays three roles (that we know of so far):

1) It quantifies the quantum information content, in qubits, of a quantum state (i.e. the
minimum number of qubits needed to reliably encode the quantum state)

2) It also quantifies the classical information, in bits, that can be gained about the
quantum state by making the best possible measurement

3) It also quantifies the amount of entanglement in bipartite states

As you will see, the Von Neumann entropy will not always give the answer to the
question we will ask!


Quantifying Entanglement of Bipartite Pure States

Consider the following two states of two 2-level systems:
$$|\psi_a\rangle = \frac{1}{\sqrt{2}}\left(|0\rangle_A|1\rangle_B + |1\rangle_A|0\rangle_B\right)$$
$$|\psi_b\rangle = \frac{1}{2}\left(\sqrt{3}\,|0\rangle_A|0\rangle_B + |1\rangle_A|1\rangle_B\right)$$
They are both entangled. But which one is more entangled??

How can we quantify the level of entanglement of states?

The answer, for at least pure states of bipartite systems, seems to be available.


Entanglement as a Resource
$$|\psi_a\rangle = \frac{1}{\sqrt{2}}\left(|0\rangle_A|1\rangle_B + |1\rangle_A|0\rangle_B\right)
\qquad\text{Alice} \leftrightarrow \text{Bob}$$
Entanglement between qubits possessed by Alice and Bob cannot be generated by
any local operations or measurements performed by Alice or Bob on their respective
qubits, or by classical communication between Alice and Bob (LOCC).

Entanglement can only be generated by a joint operation on both qubits.

Entanglement is a resource.

Bell States:
$$|\psi_a\rangle = \frac{1}{\sqrt{2}}\left(|0\rangle_A|1\rangle_B + |1\rangle_A|0\rangle_B\right)
\qquad |\psi_c\rangle = \frac{1}{\sqrt{2}}\left(|0\rangle_A|0\rangle_B + |1\rangle_A|1\rangle_B\right)$$
$$|\psi_b\rangle = \frac{1}{\sqrt{2}}\left(|0\rangle_A|1\rangle_B - |1\rangle_A|0\rangle_B\right)
\qquad |\psi_d\rangle = \frac{1}{\sqrt{2}}\left(|0\rangle_A|0\rangle_B - |1\rangle_A|1\rangle_B\right)$$


Quantifying Entanglement of Bipartite Pure States

Suppose Alice and Bob would like to prepare n copies of an entangled state:
$$|\psi\rangle = \alpha\,|0\rangle_A|0\rangle_B + \beta\,|0\rangle_A|1\rangle_B
+ \gamma\,|1\rangle_A|0\rangle_B + \delta\,|1\rangle_A|1\rangle_B$$
But what they already have in their possession are multiple copies of a Bell state
(it doesn't matter which one).

Suppose Alice and Bob use a minimum of k_min Bell states in their possession, plus lots
of local operations on their respective qubits and classical communication between
each other (LOCC), and are able to generate n copies of the desired state $|\psi\rangle$.

Then can we use the ratio k_min/n as a measure of entanglement in the state $|\psi\rangle$??
i.e. how many Bell states does one need to use to generate one copy?


Quantifying Entanglement of Bipartite Pure States
Suppose Alice and Bob have n copies of an entangled state:
$$|\psi\rangle = \alpha\,|0\rangle_A|0\rangle_B + \beta\,|0\rangle_A|1\rangle_B
+ \gamma\,|1\rangle_A|0\rangle_B + \delta\,|1\rangle_A|1\rangle_B$$
But what they want are multiple copies of a Bell state (it doesn't matter which one).

Suppose Alice and Bob are able to prepare a maximum of k_max Bell states from the n
copies of the state $|\psi\rangle$ in their possession, with only local operations on their
respective qubits and classical communication between each other (LOCC).

Then can we use the ratio k_max/n as a measure of entanglement in the state $|\psi\rangle$??
i.e. how many Bell states can one generate per copy?


Quantifying Entanglement of Bipartite Pure States
$$|\psi\rangle = \alpha\,|0\rangle_A|0\rangle_B + \beta\,|0\rangle_A|1\rangle_B
+ \gamma\,|1\rangle_A|0\rangle_B + \delta\,|1\rangle_A|1\rangle_B$$
It can be shown that in the limit n→∞,
$$\lim_{n\to\infty}\frac{k_{max}}{n} = \lim_{n\to\infty}\frac{k_{min}}{n}
= S(\hat\rho_A) = S(\hat\rho_B) \equiv E\!\left(|\psi\rangle\right)$$
(This many Bell states go into, or come out of, the above state.)
where:
$$\hat\rho_A = \mathrm{Tr}_B\left[|\psi\rangle\langle\psi|\right]
\qquad \hat\rho_B = \mathrm{Tr}_A\left[|\psi\rangle\langle\psi|\right]$$
The above expression for bipartite entanglement works even when the systems
involved are not 2-level systems but arbitrary multilevel systems.


Quantifying Entanglement of Bipartite Pure States

Consider the following two states of two 2-level systems:
$$|\psi_a\rangle = \frac{1}{\sqrt{2}}\left(|0\rangle_A|1\rangle_B + |1\rangle_A|0\rangle_B\right)
\qquad |\psi_b\rangle = \frac{1}{2}\left(\sqrt{3}\,|0\rangle_A|0\rangle_B + |1\rangle_A|1\rangle_B\right)$$
They are both entangled. But which one is more entangled??

Answer:
$$E\!\left(|\psi_a\rangle\right) = 1.0 \qquad E\!\left(|\psi_b\rangle\right) = 0.81$$
All four Bell states are maximally entangled.
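
A sketch reproducing these numbers by tracing out Bob's qubit and computing S(ρ̂_A):

```python
# Entanglement entropy E(|psi>) = S(rho_A) via partial trace over B
# (a numerical sketch reproducing E(psi_a) = 1.0 and E(psi_b) = 0.81).
import numpy as np

def entanglement_entropy(psi):           # psi: 4-vector in basis |00>,|01>,|10>,|11>
    m = psi.reshape(2, 2)                # amplitudes m[a, b] = <ab|psi>
    rho_A = m @ m.conj().T               # partial trace over B
    lam = np.linalg.eigvalsh(rho_A)
    lam = lam[lam > 1e-12]
    return -np.sum(lam * np.log2(lam))

psi_a = np.array([0, 1, 1, 0]) / np.sqrt(2)    # (|01> + |10>)/sqrt(2)
psi_b = np.array([np.sqrt(3), 0, 0, 1]) / 2    # (sqrt(3)|00> + |11>)/2

print(entanglement_entropy(psi_a))   # 1.0
print(entanglement_entropy(psi_b))   # 0.811
```
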
Quantifying Entanglement of Bipartite Mixed States

What if Alice and Bob share a mixed entangled state $\hat\rho_{AB}$?

What is the entanglement of this state?

How many Bell states can Alice and Bob distill from $\hat\rho_{AB}$?

How many Bell states are needed to prepare $\hat\rho_{AB}$?

We don't know the general answers to the above questions!!!


The Last Slide
