Efficient Computation of the
DFT: Fast Fourier Transform
Algorithms
As we have observed in the preceding chapter, the Discrete Fourier Transform (DFT) plays an important role in many applications of digital signal processing, including linear filtering, correlation analysis, and spectrum analysis. A major reason for its importance is the existence of efficient algorithms for computing the DFT.
The main topic of this chapter is the description of computationally efficient algorithms for evaluating the DFT. Two different approaches are described. One is a divide-and-conquer approach in which a DFT of size $N$, where $N$ is a composite number, is reduced to the computation of smaller DFTs from which the larger DFT is computed. In particular, we present important computational algorithms, called fast Fourier transform (FFT) algorithms, for computing the DFT when the size $N$ is a power of 2 and when it is a power of 4.
The second approach is based on the formulation of the DFT as a linear filtering operation on the data. This approach leads to two algorithms, the Goertzel algorithm and the chirp-z transform algorithm, for computing the DFT via linear filtering of the data sequence.
6.1 EFFICIENT COMPUTATION OF THE DFT: FFT ALGORITHMS
In this section we present several methods for computing the DFT efficiently. In view of the importance of the DFT in various digital signal processing applications, such as linear filtering, correlation analysis, and spectrum analysis, its efficient computation is a topic that has received considerable attention by many mathematicians, engineers, and applied scientists.
Basically, the computational problem for the DFT is to compute the sequence $\{X(k)\}$ of $N$ complex-valued numbers given another sequence of data $\{x(n)\}$ of length $N$, according to the formula
$$X(k) = \sum_{n=0}^{N-1} x(n) W_N^{kn}, \qquad 0 \le k \le N-1 \tag{6.1.1}$$
where
$$W_N = e^{-j2\pi/N} \tag{6.1.2}$$
In general, the data sequence $x(n)$ is also assumed to be complex valued. Similarly, the IDFT becomes
$$x(n) = \frac{1}{N} \sum_{k=0}^{N-1} X(k) W_N^{-nk}, \qquad 0 \le n \le N-1 \tag{6.1.3}$$
Since the DFT and IDFT involve basically the same type of computations, our discussion of efficient computational algorithms for the DFT applies as well to the efficient computation of the IDFT.
We observe that for each value of $k$, direct computation of $X(k)$ involves $N$ complex multiplications ($4N$ real multiplications) and $N-1$ complex additions ($4N-2$ real additions). Consequently, to compute all $N$ values of the DFT requires $N^2$ complex multiplications and $N^2 - N$ complex additions.
Direct computation of the DFT is basically inefficient primarily because it does not exploit the symmetry and periodicity properties of the phase factor $W_N$. In particular, these two properties are:
$$\text{Symmetry property:} \qquad W_N^{k+N/2} = -W_N^k \tag{6.1.4}$$
$$\text{Periodicity property:} \qquad W_N^{k+N} = W_N^k \tag{6.1.5}$$
The computationally efficient algorithms described in this section, known collectively as fast Fourier transform (FFT) algorithms, exploit these two basic properties of the phase factor.
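The two properties can be checked numerically. The following is a minimal Python sketch; the helper name `twiddle` and the choice $N = 8$ are illustrative assumptions, not part of the text.

```python
import cmath

def twiddle(N, k):
    """Phase factor W_N^k = exp(-j*2*pi*k/N), as defined in (6.1.2)."""
    return cmath.exp(-2j * cmath.pi * k / N)

N = 8  # illustrative size (assumption); any even N works for the symmetry check
for k in range(N):
    # Symmetry property (6.1.4): W_N^(k+N/2) = -W_N^k
    assert abs(twiddle(N, k + N // 2) + twiddle(N, k)) < 1e-12
    # Periodicity property (6.1.5): W_N^(k+N) = W_N^k
    assert abs(twiddle(N, k + N) - twiddle(N, k)) < 1e-12
```

Both checks hold only up to floating-point rounding, which is why a small tolerance is used instead of exact equality.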
6.1.1 Direct Computation of the DFT
For a complex-valued sequence $x(n)$ of $N$ points, the DFT may be expressed as
$$X_R(k) = \sum_{n=0}^{N-1} \left[ x_R(n) \cos\frac{2\pi kn}{N} + x_I(n) \sin\frac{2\pi kn}{N} \right] \tag{6.1.6}$$
$$X_I(k) = -\sum_{n=0}^{N-1} \left[ x_R(n) \sin\frac{2\pi kn}{N} - x_I(n) \cos\frac{2\pi kn}{N} \right] \tag{6.1.7}$$
The direct computation of (6.1.6) and (6.1.7) requires:
1. $2N^2$ evaluations of trigonometric functions.
2. $4N^2$ real multiplications.
3. $4N(N-1)$ real additions.
4. A number of indexing and addressing operations.
These operations are typical of DFT computational algorithms. The operations in items 2 and 3 result in the DFT values $X_R(k)$ and $X_I(k)$. The indexing and addressing operations are necessary to fetch the data $x(n)$, $0 \le n \le N-1$, and the phase factors and to store the results. The variety of DFT algorithms optimize each of these computational processes in a different way.
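As an illustration of (6.1.6) and (6.1.7), here is a minimal Python sketch of the direct computation in real arithmetic. The function name `dft_real_arithmetic` and the list-based interface are assumptions made for this example.

```python
import math

def dft_real_arithmetic(xr, xi):
    """Direct DFT using real arithmetic only.

    xr, xi: real and imaginary parts of x(n). Returns (XR, XI).
    """
    N = len(xr)
    XR, XI = [], []
    for k in range(N):
        acc_r = 0.0
        acc_i = 0.0
        for n in range(N):
            angle = 2.0 * math.pi * k * n / N
            c, s = math.cos(angle), math.sin(angle)  # 2 trigonometric evaluations
            acc_r += xr[n] * c + xi[n] * s           # 2 real multiplications
            acc_i += xr[n] * s - xi[n] * c           # 2 real multiplications
        XR.append(acc_r)    # (6.1.6)
        XI.append(-acc_i)   # (6.1.7), note the leading minus sign
    return XR, XI

# The sequence 1, 2, 3, 4 has DFT 10, -2+2j, -2, -2-2j (up to rounding).
XR, XI = dft_real_arithmetic([1.0, 2.0, 3.0, 4.0], [0.0, 0.0, 0.0, 0.0])
```

The inner loop body runs $N^2$ times, which accounts for the $2N^2$ trigonometric evaluations and $4N^2$ real multiplications listed above.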
6.1.2 Divide-and-Conquer Approach to Computation of
the DFT
The development of computationally efficient algorithms for the DFT is made possible if we adopt a divide-and-conquer approach. This approach is based on the decomposition of an $N$-point DFT into successively smaller DFTs. This basic approach leads to a family of computationally efficient algorithms known collectively as FFT algorithms.
To illustrate the basic notions, let us consider the computation of an $N$-point DFT, where $N$ can be factored as a product of two integers, that is,
$$N = LM \tag{6.1.8}$$
The assumption that $N$ is not a prime number is not restrictive, since we can pad any sequence with zeros to ensure a factorization of the form (6.1.8).
Now the sequence $x(n)$, $0 \le n \le N-1$, can be stored either in a one-dimensional array indexed by $n$ or as a two-dimensional array indexed by $l$ and $m$, where $0 \le l \le L-1$ and $0 \le m \le M-1$, as illustrated in Fig. 6.1. Note that $l$ is the row index and $m$ is the column index. Thus, the sequence $x(n)$ can be stored in a rectangular array in a variety of ways, each of which depends on the mapping of index $n$ to the indexes $(l, m)$.
For example, suppose that we select the mapping
$$n = Ml + m \tag{6.1.9}$$
This leads to an arrangement in which the first row consists of the first $M$ elements of $x(n)$, the second row consists of the next $M$ elements of $x(n)$, and so on, as illustrated in Fig. 6.2(a). On the other hand, the mapping
$$n = l + mL \tag{6.1.10}$$
stores the first $L$ elements of $x(n)$ in the first column, the next $L$ elements in the second column, and so on, as illustrated in Fig. 6.2(b).
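The two index mappings can be made concrete with a short Python sketch. The function names `row_wise` and `column_wise` and the six-sample stand-in sequence are assumptions chosen for illustration.

```python
def row_wise(x, L, M):
    """Mapping (6.1.9): array entry (l, m) holds x(M*l + m)."""
    return [[x[M * l + m] for m in range(M)] for l in range(L)]

def column_wise(x, L, M):
    """Mapping (6.1.10): array entry (l, m) holds x(l + m*L)."""
    return [[x[l + m * L] for m in range(M)] for l in range(L)]

x = list(range(6))            # stand-in for x(0), ..., x(5), with L = 2 and M = 3
print(row_wise(x, 2, 3))      # [[0, 1, 2], [3, 4, 5]]
print(column_wise(x, 2, 3))   # [[0, 2, 4], [1, 3, 5]]
```

In both cases the result is an $L \times M$ array; only the order in which the samples fill it differs.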
A similar arrangement can be used to store the computed DFT values. In particular, the mapping is from the index $k$ to a pair of indices $(p, q)$, where $0 \le p \le L-1$ and $0 \le q \le M-1$. If we select the mapping
$$k = Mp + q \tag{6.1.11}$$
the DFT is stored on a row-wise basis, where the first row contains the first $M$ elements of the DFT $X(k)$, the second row contains the next set of $M$ elements, and so on. On the other hand, the mapping
$$k = qL + p \tag{6.1.12}$$
results in a column-wise storage of $X(k)$, where the first $L$ elements are stored in the first column, the second set of $L$ elements are stored in the second column, and so on.
Figure 6.1 Two-dimensional data array for storing the sequence $x(n)$, $0 \le n \le N-1$: (a) one-dimensional array indexed by $n$; (b) $L \times M$ array $x(l, m)$ with row index $l$ and column index $m$.
Now suppose that $x(n)$ is mapped into the rectangular array $x(l, m)$ and $X(k)$ is mapped into a corresponding rectangular array $X(p, q)$. Then the DFT can be expressed as a double sum over the elements of the rectangular array multiplied by the corresponding phase factors. To be specific, let us adopt a column-wise mapping for $x(n)$ given by (6.1.10) and the row-wise mapping for the DFT given by (6.1.11). Then
$$X(p, q) = \sum_{m=0}^{M-1} \sum_{l=0}^{L-1} x(l, m) W_N^{(Mp+q)(mL+l)} \tag{6.1.13}$$
But
$$W_N^{(Mp+q)(mL+l)} = W_N^{MLmp} W_N^{mLq} W_N^{Mpl} W_N^{lq} \tag{6.1.14}$$
However, $W_N^{Nmp} = 1$, $W_N^{mqL} = W_{N/L}^{mq} = W_M^{mq}$, and $W_N^{Mpl} = W_{N/M}^{pl} = W_L^{pl}$.
Figure 6.2 Two arrangements for the data arrays.
(a) Row-wise, n = Ml + m
l \ m   0           1             2             ...   M-1
0       x(0)        x(1)          x(2)          ...   x(M-1)
1       x(M)        x(M+1)        x(M+2)        ...   x(2M-1)
2       x(2M)       x(2M+1)       x(2M+2)       ...   x(3M-1)
...
L-1     x((L-1)M)   x((L-1)M+1)   x((L-1)M+2)   ...   x(LM-1)
(b) Column-wise, n = l + mL
l \ m   0        1         2         ...   M-1
0       x(0)     x(L)      x(2L)     ...   x((M-1)L)
1       x(1)     x(L+1)    x(2L+1)   ...   x((M-1)L+1)
2       x(2)     x(L+2)    x(2L+2)   ...   x((M-1)L+2)
...
L-1     x(L-1)   x(2L-1)   x(3L-1)   ...   x(LM-1)
With these simplifications, (6.1.13) can be expressed as
$$X(p, q) = \sum_{l=0}^{L-1} \left\{ W_N^{lq} \left[ \sum_{m=0}^{M-1} x(l, m) W_M^{mq} \right] \right\} W_L^{lp} \tag{6.1.15}$$
The expression in (6.1.15) involves the computation of DFTs of length $M$ and length $L$. To elaborate, let us subdivide the computation into three steps:
1. First, we compute the $M$-point DFTs
$$F(l, q) \equiv \sum_{m=0}^{M-1} x(l, m) W_M^{mq}, \qquad 0 \le q \le M-1 \tag{6.1.16}$$
for each of the rows $l = 0, 1, \ldots, L-1$.
2. Second, we compute a new rectangular array $G(l, q)$ defined as
$$G(l, q) = W_N^{lq} F(l, q), \qquad 0 \le l \le L-1, \quad 0 \le q \le M-1 \tag{6.1.17}$$
3. Finally, we compute the $L$-point DFTs
$$X(p, q) = \sum_{l=0}^{L-1} G(l, q) W_L^{lp} \tag{6.1.18}$$
for each column $q = 0, 1, \ldots, M-1$, of the array $G(l, q)$.
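The three steps can be sketched in Python as follows. This is an illustrative sketch only: the function names `dft` and `dft_divide_and_conquer` are assumptions, and the smaller DFTs are evaluated directly rather than decomposed further.

```python
import cmath

def dft(x):
    """Direct DFT per (6.1.1), used here for the small row and column DFTs."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
            for k in range(N)]

def dft_divide_and_conquer(x, L, M):
    """N = L*M point DFT via steps (6.1.16) through (6.1.18)."""
    N = L * M
    if len(x) != N:
        raise ValueError("len(x) must equal L*M")
    # Column-wise storage (6.1.10): x(l, m) = x(l + m*L)
    rows = [[x[l + m * L] for m in range(M)] for l in range(L)]
    # Step 1 (6.1.16): M-point DFT of each row gives F(l, q)
    F = [dft(row) for row in rows]
    # Step 2 (6.1.17): G(l, q) = W_N^{lq} F(l, q)
    G = [[cmath.exp(-2j * cmath.pi * l * q / N) * F[l][q] for q in range(M)]
         for l in range(L)]
    # Step 3 (6.1.18): L-point DFT of each column gives X(p, q)
    X = [0j] * N
    for q in range(M):
        column = dft([G[l][q] for l in range(L)])
        for p in range(L):
            X[M * p + q] = column[p]  # row-wise read-out (6.1.11): k = M*p + q
    return X
```

For any factorization $N = LM$ the result should agree with `dft(x)` up to rounding; for instance, a 15-point input can be processed with `L = 5` and `M = 3`.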
On the surface it may appear that the computational procedure outlined above is more complex than the direct computation of the DFT. However, let us evaluate the computational complexity of (6.1.15). The first step involves the computation of $L$ DFTs, each of $M$ points. Hence this step requires $LM^2$ complex multiplications and $LM(M-1)$ complex additions. The second step requires $LM$ complex multiplications. Finally, the third step in the computation requires $ML^2$ complex multiplications and $ML(L-1)$ complex additions. Therefore, the computational complexity is
Complex multiplications: $N(M + L + 1)$
Complex additions: $N(M + L - 2)$ (6.1.19)
where $N = ML$. Thus the number of multiplications has been reduced from $N^2$ to $N(M + L + 1)$ and the number of additions has been reduced from $N(N-1)$ to $N(M + L - 2)$.
For example, suppose that $N = 1000$ and we select $L = 2$ and $M = 500$. Then, instead of having to perform $10^6$ complex multiplications via direct computation of the DFT, this approach leads to 503,000 complex multiplications. This represents a reduction by approximately a factor of 2. The number of additions is also reduced by about a factor of 2.
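The operation counts in (6.1.19) can be tallied step by step. The short Python sketch below does so; the function names are assumptions introduced for illustration.

```python
def direct_counts(N):
    """(complex multiplications, complex additions) for direct computation."""
    return N * N, N * (N - 1)

def divide_and_conquer_counts(L, M):
    """Counts for the three-step procedure with N = L*M."""
    mults = L * M**2 + L * M + M * L**2        # steps 1, 2, and 3
    adds = L * M * (M - 1) + M * L * (L - 1)   # steps 1 and 3
    return mults, adds

print(direct_counts(1000))                # (1000000, 999000)
print(divide_and_conquer_counts(2, 500))  # (503000, 500000)
```

The second line of output matches $N(M+L+1) = 503{,}000$ multiplications and $N(M+L-2) = 500{,}000$ additions for $N = 1000$, $L = 2$, $M = 500$.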
When $N$ is a highly composite number, that is, $N$ can be factored into a product of prime numbers of the form
$$N = r_1 r_2 \cdots r_v \tag{6.1.20}$$
then the decomposition above can be repeated $(v - 1)$ more times. This procedure results in smaller DFTs, which, in turn, leads to a more efficient computational algorithm.
In effect, the first segmentation of the sequence $x(n)$ into a rectangular array of $M$ columns with $L$ elements in each column resulted in DFTs of sizes $L$ and $M$. Further decomposition of the data in effect involves the segmentation of each row (or column) into smaller rectangular arrays which result in smaller DFTs. This procedure terminates when $N$ is factored into its prime factors.
Example 6.1.1
To illustrate this computational procedure, let us consider the computation of an $N = 15$ point DFT. Since $N = 5 \times 3 = 15$, we select $L = 5$ and $M = 3$. In other words, we store the 15-point sequence $x(n)$ column-wise as follows:
Row 1:   x(0,0) = x(0)   x(0,1) = x(5)   x(0,2) = x(10)
Row 2:   x(1,0) = x(1)   x(1,1) = x(6)   x(1,2) = x(11)
Row 3:   x(2,0) = x(2)   x(2,1) = x(7)   x(2,2) = x(12)
Row 4:   x(3,0) = x(3)   x(3,1) = x(8)   x(3,2) = x(13)
Row 5:   x(4,0) = x(4)   x(4,1) = x(9)   x(4,2) = x(14)
Now, we compute the three-point DFTs for each of the five rows. This leads to the following $5 \times 3$ array:
F(0,0)   F(0,1)   F(0,2)
F(1,0)   F(1,1)   F(1,2)
F(2,0)   F(2,1)   F(2,2)
F(3,0)   F(3,1)   F(3,2)
F(4,0)   F(4,1)   F(4,2)
The next step is to multiply each of the terms $F(l, q)$ by the phase factors $W_N^{lq} = W_{15}^{lq}$, $0 \le l \le 4$ and $0 \le q \le 2$. This computation results in the $5 \times 3$ array:
Column 1   Column 2   Column 3
G(0,0)     G(0,1)     G(0,2)
G(1,0)     G(1,1)     G(1,2)
G(2,0)     G(2,1)     G(2,2)
G(3,0)     G(3,1)     G(3,2)
G(4,0)     G(4,1)     G(4,2)
The final step is to compute the five-point DFTs for each of the three columns. This computation yields the desired values of the DFT in the form
X(0,0) = X(0)    X(0,1) = X(1)    X(0,2) = X(2)
X(1,0) = X(3)    X(1,1) = X(4)    X(1,2) = X(5)
X(2,0) = X(6)    X(2,1) = X(7)    X(2,2) = X(8)
X(3,0) = X(9)    X(3,1) = X(10)   X(3,2) = X(11)
X(4,0) = X(12)   X(4,1) = X(13)   X(4,2) = X(14)
Figure 6.3 illustrates the steps in the computation.
It is interesting to view the segmented data sequence and the resulting DFT in terms of one-dimensional arrays. When the input sequence $x(n)$ and the output DFT $X(k)$ in the two-dimensional arrays are read across from row 1 through row 5, we obtain the following sequences:
INPUT ARRAY
x(0) x(5) x(10) x(1) x(6) x(11) x(2) x(7) x(12) x(3) x(8) x(13) x(4) x(9) x(14)
OUTPUT ARRAY
X(0) X(1) X(2) X(3) X(4) X(5) X(6) X(7) X(8) X(9) X(10) X(11) X(12) X(13) X(14)
We observe that the input data sequence is shuffled from the normal order in the computation of the DFT. On the other hand, the output sequence occurs in normal order. In this case the rearrangement of the input data array is due to the segmentation of the one-dimensional array into a rectangular array and the order in which the DFTs are computed. This shuffling of either the input data sequence or the output DFT sequence is a characteristic of most FFT algorithms.
Figure 6.3 Computation of $N = 15$-point DFT by means of 3-point and 5-point DFTs.
To summarize, the algorithm that we have introduced involves the following computations:
Algorithm 1
1. Store the signal column-wise.
2. Compute the $M$-point DFT of each row.
3. Multiply the resulting array by the phase factors $W_N^{lq}$.
4. Compute the $L$-point DFT of each column.
5. Read the resulting array row-wise.
An additional algorithm with a similar computational structure can be obtained if the input signal is stored row-wise and the resulting transformation is column-wise. In this case we select
$$n = Ml + m, \qquad k = qL + p \tag{6.1.21}$$
This choice of indices leads to the formula for the DFT in the form
$$X(p, q) = \sum_{m=0}^{M-1} \sum_{l=0}^{L-1} x(l, m) W_N^{pm} W_L^{pl} W_M^{qm} = \sum_{m=0}^{M-1} W_M^{mq} \left[ \sum_{l=0}^{L-1} x(l, m) W_L^{lp} \right] W_N^{mp} \tag{6.1.22}$$
Thus we obtain a second algorithm.
Algorithm 2
1. Store the signal row-wise.
2. Compute the $L$-point DFT at each column.
3. Multiply the resulting array by the factors $W_N^{pm}$.
4. Compute the $M$-point DFT of each row.
5. Read the resulting array column-wise.
The two algorithms given above have the same complexity. However, they differ in the arrangement of the computations. In the following sections we exploit the divide-and-conquer approach to derive fast algorithms when the size of the DFT is restricted to be a power of 2 or a power of 4.
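The five steps of Algorithm 2 can be sketched in Python in the same spirit. The names `dft` and `dft_algorithm_2` are assumptions for this example, and the small DFTs are again evaluated directly.

```python
import cmath

def dft(x):
    """Direct DFT per (6.1.1), used for the small column and row DFTs."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
            for k in range(N)]

def dft_algorithm_2(x, L, M):
    """N = L*M point DFT with row-wise input and column-wise output (6.1.21)."""
    N = L * M
    if len(x) != N:
        raise ValueError("len(x) must equal L*M")
    # 1. Store the signal row-wise: x(l, m) = x(M*l + m)
    a = [[x[M * l + m] for m in range(M)] for l in range(L)]
    # 2. L-point DFT of each column: H(p, m) = sum over l of x(l, m) W_L^{lp}
    H = [[0j] * M for _ in range(L)]
    for m in range(M):
        column = dft([a[l][m] for l in range(L)])
        for p in range(L):
            H[p][m] = column[p]
    X = [0j] * N
    for p in range(L):
        # 3. Multiply by the factors W_N^{pm}
        scaled = [cmath.exp(-2j * cmath.pi * p * m / N) * H[p][m] for m in range(M)]
        # 4. M-point DFT of each row
        row = dft(scaled)
        # 5. Read column-wise: k = q*L + p
        for q in range(M):
            X[q * L + p] = row[q]
    return X
```

Compared with Algorithm 1, the column DFTs come first and the row DFTs last, but the same number of small DFTs and phase-factor multiplications is performed.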
6.1.3 Radix-2 FFT Algorithms
In the preceding section we described four algorithms for efficient computation of the DFT based on the divide-and-conquer approach. Such an approach is applicable when the number $N$ of data points is not a prime. In particular, the approach is very efficient when $N$ is highly composite, that is, when $N$ can be factored as $N = r_1 r_2 r_3 \cdots r_v$, where the $\{r_j\}$ are prime.
Of particular importance is the case in which $r_1 = r_2 = \cdots = r_v \equiv r$, so that $N = r^v$. In such a case the DFTs are of size $r$, so that the computation of the $N$-point DFT has a regular pattern. The number $r$ is called the radix of the FFT algorithm.
In this section we describe radix-2 algorithms, which are by far the most widely used FFT algorithms. Radix-4 algorithms are described in the following section.
Let us consider the computation of the $N = 2^v$ point DFT by the divide-and-conquer approach specified by (6.1.16) through (6.1.18). We select $M = N/2$ and $L = 2$. This selection results in a split of the $N$-point data sequence into two $N/2$-point data sequences $f_1(n)$ and $f_2(n)$, corresponding to the even-numbered and odd-numbered samples of $x(n)$, respectively, that is,
$$f_1(n) = x(2n), \qquad f_2(n) = x(2n+1), \qquad n = 0, 1, \ldots, \frac{N}{2} - 1 \tag{6.1.23}$$
Thus $f_1(n)$ and $f_2(n)$ are obtained by decimating $x(n)$ by a factor of 2, and hence the resulting FFT algorithm is called a decimation-in-time algorithm.
Now the $N$-point DFT can be expressed in terms of the DFTs of the decimated sequences as follows:
$$\begin{aligned} X(k) &= \sum_{n=0}^{N-1} x(n) W_N^{kn}, \qquad k = 0, 1, \ldots, N-1 \\ &= \sum_{n\ \text{even}} x(n) W_N^{kn} + \sum_{n\ \text{odd}} x(n) W_N^{kn} \\ &= \sum_{m=0}^{(N/2)-1} x(2m) W_N^{2mk} + \sum_{m=0}^{(N/2)-1} x(2m+1) W_N^{k(2m+1)} \end{aligned} \tag{6.1.24}$$
But $W_N^2 = W_{N/2}$. With this substitution, (6.1.24) can be expressed as
$$\begin{aligned} X(k) &= \sum_{m=0}^{(N/2)-1} f_1(m) W_{N/2}^{km} + W_N^k \sum_{m=0}^{(N/2)-1} f_2(m) W_{N/2}^{km} \\ &= F_1(k) + W_N^k F_2(k), \qquad k = 0, 1, \ldots, N-1 \end{aligned} \tag{6.1.25}$$
where $F_1(k)$ and $F_2(k)$ are the $N/2$-point DFTs of the sequences $f_1(m)$ and $f_2(m)$, respectively.
Since $F_1(k)$ and $F_2(k)$ are periodic, with period $N/2$, we have $F_1(k + N/2) = F_1(k)$ and $F_2(k + N/2) = F_2(k)$. In addition, the factor $W_N^{k+N/2} = -W_N^k$. Hence (6.1.25) can be expressed as
$$X(k) = F_1(k) + W_N^k F_2(k), \qquad k = 0, 1, \ldots, \frac{N}{2} - 1 \tag{6.1.26}$$
$$X\!\left(k + \frac{N}{2}\right) = F_1(k) - W_N^k F_2(k), \qquad k = 0, 1, \ldots, \frac{N}{2} - 1 \tag{6.1.27}$$
We observe that the direct computation of $F_1(k)$ requires $(N/2)^2$ complex multiplications. The same applies to the computation of $F_2(k)$. Furthermore, there are $N/2$ additional complex multiplications required to compute $W_N^k F_2(k)$. Hence the computation of $X(k)$ requires $2(N/2)^2 + N/2 = N^2/2 + N/2$ complex multiplications. This first step results in a reduction of the number of multiplications from $N^2$ to $N^2/2 + N/2$, which is about a factor of 2 for $N$ large.
To be consistent with our previous notation, we may define
$$G_1(k) = F_1(k), \qquad G_2(k) = W_N^k F_2(k), \qquad k = 0, 1, \ldots, \frac{N}{2} - 1$$
Then the DFT $X(k)$ may be expressed as
$$\begin{aligned} X(k) &= G_1(k) + G_2(k), \qquad k = 0, 1, \ldots, \frac{N}{2} - 1 \\ X\!\left(k + \frac{N}{2}\right) &= G_1(k) - G_2(k), \qquad k = 0, 1, \ldots, \frac{N}{2} - 1 \end{aligned} \tag{6.1.28}$$
This computation is illustrated in Fig. 6.4.
Having performed the decimation-in-time once, we can repeat the process for each of the sequences $f_1(n)$ and $f_2(n)$. Thus $f_1(n)$ would result in the two $N/4$-point sequences
$$v_{11}(n) = f_1(2n), \qquad v_{12}(n) = f_1(2n+1), \qquad n = 0, 1, \ldots, \frac{N}{4} - 1 \tag{6.1.29}$$
and $f_2(n)$ would yield
$$v_{21}(n) = f_2(2n), \qquad v_{22}(n) = f_2(2n+1), \qquad n = 0, 1, \ldots, \frac{N}{4} - 1 \tag{6.1.30}$$
By computing $N/4$-point DFTs, we would obtain the $N/2$-point DFTs $F_1(k)$ and $F_2(k)$ from the relations
$$\begin{aligned} F_1(k) &= V_{11}(k) + W_{N/2}^k V_{12}(k), \qquad k = 0, 1, \ldots, \frac{N}{4} - 1 \\ F_1\!\left(k + \frac{N}{4}\right) &= V_{11}(k) - W_{N/2}^k V_{12}(k), \qquad k = 0, 1, \ldots, \frac{N}{4} - 1 \end{aligned} \tag{6.1.31}$$
$$\begin{aligned} F_2(k) &= V_{21}(k) + W_{N/2}^k V_{22}(k), \qquad k = 0, 1, \ldots, \frac{N}{4} - 1 \\ F_2\!\left(k + \frac{N}{4}\right) &= V_{21}(k) - W_{N/2}^k V_{22}(k), \qquad k = 0, 1, \ldots, \frac{N}{4} - 1 \end{aligned} \tag{6.1.32}$$
where the $\{V_{ij}(k)\}$ are the $N/4$-point DFTs of the sequences $\{v_{ij}(n)\}$.
Figure 6.4 First step in the decimation-in-time algorithm.
TABLE 6.1 COMPARISON OF COMPUTATIONAL COMPLEXITY FOR THE DIRECT COMPUTATION OF THE DFT VERSUS THE FFT ALGORITHM
Number of   Complex Multiplications   Complex Multiplications   Speed
Points,     in Direct Computation,    in FFT Algorithm,         Improvement
N           N^2                       (N/2) log2 N              Factor
4           16                        4                         4.0
8           64                        12                        5.3
16          256                       32                        8.0
32          1,024                     80                        12.8
64          4,096                     192                       21.3
128         16,384                    448                       36.6
256         65,536                    1,024                     64.0
512         262,144                   2,304                     113.8
1,024       1,048,576                 5,120                     204.8
We observe that the computation of $\{V_{ij}(k)\}$ requires $4(N/4)^2$ multiplications and hence the computation of $F_1(k)$ and $F_2(k)$ can be accomplished with $N^2/4 + N/2$ complex multiplications. An additional $N/2$ complex multiplications are required to compute $X(k)$ from $F_1(k)$ and $F_2(k)$. Consequently, the total number of multiplications is reduced approximately by a factor of 2 again to $N^2/4 + N$.
The decimation of the data sequence can be repeated again and again until the resulting sequences are reduced to one-point sequences. For $N = 2^v$, this decimation can be performed $v = \log_2 N$ times. Thus the total number of complex multiplications is reduced to $(N/2) \log_2 N$. The number of complex additions is $N \log_2 N$. Table 6.1 presents a comparison of the number of complex multiplications in the FFT and in the direct computation of the DFT.
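The repeated decimation can be written as a short recursion. The following Python sketch applies (6.1.26) and (6.1.27) at every level and also counts the multiplications by phase factors; the function name `fft_dit` and the returned counter are assumptions made for illustration, and every phase factor (including trivial ones such as $W_N^0$) is counted as one multiplication.

```python
import cmath

def fft_dit(x):
    """Recursive radix-2 decimation-in-time FFT.

    Returns (X, mults), where mults is the number of phase-factor multiplications.
    """
    N = len(x)
    if N < 1 or N & (N - 1):
        raise ValueError("length must be a power of 2")
    if N == 1:
        return [complex(x[0])], 0          # a one-point DFT is the sample itself
    F1, m1 = fft_dit(x[0::2])              # DFT of even-numbered samples f1(n)
    F2, m2 = fft_dit(x[1::2])              # DFT of odd-numbered samples f2(n)
    X = [0j] * N
    for k in range(N // 2):
        t = cmath.exp(-2j * cmath.pi * k / N) * F2[k]  # W_N^k F2(k)
        X[k] = F1[k] + t                   # (6.1.26)
        X[k + N // 2] = F1[k] - t          # (6.1.27)
    return X, m1 + m2 + N // 2

X, mults = fft_dit(list(range(8)))
print(mults)  # 12
```

The count satisfies $T(N) = 2T(N/2) + N/2$ with $T(1) = 0$, which gives $(N/2)\log_2 N$ and reproduces the FFT column of Table 6.1.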
For illustrative purposes, Fig. 6.5 depicts the computation of an $N = 8$ point DFT. We observe that the computation is performed in three stages, beginning with the computations of four two-point DFTs, then two four-point DFTs, and finally, one eight-point DFT. The combination of the smaller DFTs to form the larger DFT is illustrated in Fig. 6.6 for $N = 8$.
Figure 6.5 Three stages in the computation of an $N = 8$-point DFT.
Figure 6.6 (flow graph with Stage 1, Stage 2, and Stage 3, producing the outputs $X(0)$ through $X(7)$)
Observe that the basic computation performed at every stage, as illustrated in Fig. 6.6, is to take two complex numbers, say the pair $(a, b)$, multiply $b$ by $W_N^r$, and then add and subtract the product from $a$ to form two new complex numbers $(A, B)$. This basic computation, which is shown in Fig. 6.7, is called a butterfly because the flow graph resembles a butterfly.
In general, each butterfly involves one complex multiplication and two complex additions. For $N = 2^v$, there are $N/2$ butterflies per stage of the computation process and $\log_2 N$ stages. Therefore, as previously indicated, the total number of complex multiplications is $(N/2) \log_2 N$ and the total number of complex additions is $N \log_2 N$.
Once a butterfly operation is performed on a pair of complex numbers $(a, b)$ to produce $(A, B)$, there is no need to save the input pair $(a, b)$. Hence we can store the result $(A, B)$ in the same locations as $(a, b)$. Consequently, we require a fixed amount of storage, namely, $2N$ storage registers, in order to store the results ($N$ complex numbers) of the computations at each stage. Since the same $2N$ storage locations are used throughout the computation of the $N$-point DFT, we say that the computations are done in place.
Figure 6.7 Basic butterfly computation in the decimation-in-time FFT algorithm: $A = a + W_N^r b$, $B = a - W_N^r b$.
A second important observation is concerned with the order of the input data sequence after it is decimated $(v - 1)$ times. For example, if we consider the case where $N = 8$, we know that the first decimation yields the sequence $x(0), x(2), x(4), x(6), x(1), x(3), x(5), x(7)$, and the second decimation results in the sequence $x(0), x(4), x(2), x(6), x(1), x(5), x(3), x(7)$. This shuffling of the input data sequence has a well-defined order as can be ascertained from observing Fig. 6.8, which illustrates the decimation of the eight-point sequence. By expressing the index $n$, in the sequence $x(n)$, in binary form, we note that the order of the decimated data sequence is easily obtained by reading the binary representation of the index $n$ in reverse order. Thus the data point $x(3) \equiv x(011)$ is placed in position $m = 110$ or $m = 6$ in the decimated array. Thus we say that the data $x(n)$ after decimation is stored in bit-reversed order.
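Bit reversal of an index is simple to express in code. The Python sketch below uses a hypothetical helper `bit_reverse` (the name and the explicit `bits` argument are assumptions) and reproduces the $N = 8$ ordering discussed above.

```python
def bit_reverse(n, bits):
    """Reverse the lowest `bits` bits of the index n."""
    r = 0
    for _ in range(bits):
        r = (r << 1) | (n & 1)  # move the lowest bit of n onto the end of r
        n >>= 1
    return r

# Position m of the decimated array holds x(bit_reverse(m)) for N = 8 (3 bits).
order = [bit_reverse(m, 3) for m in range(8)]
print(order)              # [0, 4, 2, 6, 1, 5, 3, 7]
print(bit_reverse(3, 3))  # 6, so x(3) = x(011) lands in position 110
```

Because reversing the bits twice returns the original index, the same function maps a sample index to its position and a position back to its sample index.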
With the input data sequence stored in bit-reversed order and the butterfly computations performed in place, the resulting DFT sequence $X(k)$ is obtained in natural order (i.e., $k = 0, 1, \ldots, N-1$). On the other hand, we should indicate that it is possible to arrange the FFT algorithm such that the input is left in natural order and the resulting output DFT will occur in bit-reversed order. Furthermore, we can impose the restriction that both the input data $x(n)$ and the output DFT $X(k)$ be in natural order, and derive an FFT algorithm in which the computations are not done in place. Hence such an algorithm requires additional storage.
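An in-place arrangement with bit-reversed input and natural-order output might look like the following Python sketch. The function name `fft_dit_in_place` and the loop structure are assumptions for illustration; the butterfly is the one described above, $A = a + W^r b$, $B = a - W^r b$.

```python
import cmath

def fft_dit_in_place(a):
    """In-place radix-2 decimation-in-time FFT; overwrites and returns the list a."""
    N = len(a)
    if N < 1 or N & (N - 1):
        raise ValueError("length must be a power of 2")
    bits = N.bit_length() - 1
    # Shuffle the input into bit-reversed order.
    for i in range(N):
        j, t = 0, i
        for _ in range(bits):
            j = (j << 1) | (t & 1)
            t >>= 1
        if i < j:                      # swap each pair only once
            a[i], a[j] = a[j], a[i]
    # log2(N) stages of N/2 butterflies each, reusing the same storage.
    size = 2
    while size <= N:
        half = size // 2
        for start in range(0, N, size):
            for r in range(half):
                w = cmath.exp(-2j * cmath.pi * r / size)  # phase factor W_size^r
                t = w * a[start + r + half]
                u = a[start + r]
                a[start + r] = u + t          # A = a + W^r b
                a[start + r + half] = u - t   # B = a - W^r b
        size *= 2
    return a
```

For example, `fft_dit_in_place([1, 2, 3, 4])` overwrites the list with values close to `[10, -2+2j, -2, -2-2j]`; no second array of length $N$ is allocated.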
Another important radix-2 FFT algorithm, called the decimation-in-frequency algorithm, is obtained by using the divide-and-conquer approach described in Section 6.1.2 with the choice of $M = 2$ and $L = N/2$. This choice of parameters implies a column-wise storage of the input data sequence. To derive the algorithm, we begin by splitting the DFT formula into two summations, one of which involves the sum over the first $N/2$ data points and the second sum involves the last $N/2$ data points. Thus we obtain
$$\begin{aligned} X(k) &= \sum_{n=0}^{(N/2)-1} x(n) W_N^{kn} + \sum_{n=N/2}^{N-1} x(n) W_N^{kn} \\ &= \sum_{n=0}^{(N/2)-1} x(n) W_N^{kn} + W_N^{Nk/2} \sum_{n=0}^{(N/2)-1} x\!\left(n + \frac{N}{2}\right) W_N^{kn} \end{aligned} \tag{6.1.33}$$
Since $W_N^{kN/2} = (-1)^k$, the expression (6.1.33) can be rewritten as
$$X(k) = \sum_{n=0}^{(N/2)-1} \left[ x(n) + (-1)^k x\!\left(n + \frac{N}{2}\right) \right] W_N^{kn} \tag{6.1.34}$$
Figure 6.8 Shuffling of the data and bit reversal. (a) Memory addresses (decimal and binary) and memory contents through the data decimations. (b) Bit patterns of the index:
(n2 n1 n0) -> (n0 n2 n1) -> (n0 n1 n2)
(0 0 0)    -> (0 0 0)    -> (0 0 0)
(0 0 1)    -> (1 0 0)    -> (1 0 0)
(0 1 0)    -> (0 0 1)    -> (0 1 0)
(0 1 1)    -> (1 0 1)    -> (1 1 0)
(1 0 0)    -> (0 1 0)    -> (0 0 1)
(1 0 1)    -> (1 1 0)    -> (1 0 1)
(1 1 0)    -> (0 1 1)    -> (0 1 1)
(1 1 1)    -> (1 1 1)    -> (1 1 1)
Now, let us split (decimate) $X(k)$ into the even- and odd-numbered samples. Thus we obtain
$$X(2k) = \sum_{n=0}^{(N/2)-1} \left[ x(n) + x\!\left(n + \frac{N}{2}\right) \right] W_{N/2}^{kn}, \qquad k = 0, 1, \ldots, \frac{N}{2} - 1 \tag{6.1.35}$$
and
$$X(2k+1) = \sum_{n=0}^{(N/2)-1} \left\{ \left[ x(n) - x\!\left(n + \frac{N}{2}\right) \right] W_N^n \right\} W_{N/2}^{kn}, \qquad k = 0, 1, \ldots, \frac{N}{2} - 1 \tag{6.1.36}$$
where we have used the fact that $W_N^2 = W_{N/2}$.
If we define the $N/2$-point sequences $g_1(n)$ and $g_2(n)$ as
$$\begin{aligned} g_1(n) &= x(n) + x\!\left(n + \frac{N}{2}\right) \\ g_2(n) &= \left[ x(n) - x\!\left(n + \frac{N}{2}\right) \right] W_N^n, \qquad n = 0, 1, 2, \ldots, \frac{N}{2} - 1 \end{aligned} \tag{6.1.37}$$
then
$$\begin{aligned} X(2k) &= \sum_{n=0}^{(N/2)-1} g_1(n) W_{N/2}^{kn} \\ X(2k+1) &= \sum_{n=0}^{(N/2)-1} g_2(n) W_{N/2}^{kn} \end{aligned} \tag{6.1.38}$$
The computation of the sequences $g_1(n)$ and $g_2(n)$ according to (6.1.37) and the subsequent use of these sequences to compute the $N/2$-point DFTs are depicted in Fig. 6.9. We observe that the basic computation in this figure involves the butterfly operation illustrated in Fig. 6.10.
This computational procedure can be repeated through decimation of the $N/2$-point DFTs, $X(2k)$ and $X(2k+1)$. The entire process involves $v = \log_2 N$ stages of decimation, where each stage involves $N/2$ butterflies of the type shown in Fig. 6.10. Consequently, the computation of the $N$-point DFT via the decimation-in-frequency FFT algorithm requires $(N/2) \log_2 N$ complex multiplications and $N \log_2 N$ complex additions, just as in the decimation-in-time algorithm. For illustrative purposes, the eight-point decimation-in-frequency algorithm is given in Fig. 6.11.
Figure 6.10 Basic butterfly computation in the decimation-in-frequency FFT algorithm: $A = a + b$, $B = (a - b) W_N^r$.
Figure 6.11 $N = 8$-point decimation-in-frequency FFT algorithm.
We observe from Fig. 6.11 that the input data $x(n)$ occurs in natural order, but the output DFT occurs in bit-reversed order. We also note that the computations are performed in place. However, it is possible to reconfigure the decimation-in-frequency algorithm so that the input sequence occurs in bit-reversed order while the output DFT occurs in normal order. Furthermore, if we abandon the requirement that the computations be done in place, it is also possible to have both the input data and the output DFT in normal order.
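The decimation-in-frequency procedure can be sketched in Python as an in-place computation with natural-order input and bit-reversed output. The names `fft_dif_in_place` and `bit_reverse` are assumptions made for this example; each butterfly computes $A = a + b$ and $B = (a - b)W^r$, as in (6.1.37).

```python
import cmath

def bit_reverse(n, bits):
    """Reverse the lowest `bits` bits of n."""
    r = 0
    for _ in range(bits):
        r = (r << 1) | (n & 1)
        n >>= 1
    return r

def fft_dif_in_place(a):
    """In-place radix-2 decimation-in-frequency FFT.

    Input in natural order; on return a[m] holds X(bit_reverse(m)).
    """
    N = len(a)
    if N < 1 or N & (N - 1):
        raise ValueError("length must be a power of 2")
    size = N
    while size >= 2:
        half = size // 2
        for start in range(0, N, size):
            for r in range(half):
                u = a[start + r]
                v = a[start + r + half]
                a[start + r] = u + v                                  # A = a + b
                a[start + r + half] = (u - v) * cmath.exp(-2j * cmath.pi * r / size)  # B = (a - b) W^r
        size = half
    return a

data = fft_dif_in_place([1, 2, 3, 4])
# Undo the bit reversal to list X(0), X(1), X(2), X(3) in natural order.
natural = [data[bit_reverse(k, 2)] for k in range(4)]
```

Here `natural` is close to `[10, -2+2j, -2, -2-2j]`, while `data` itself holds the same values in the order $X(0), X(2), X(1), X(3)$.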
6.1.4 Radix-4 FFT Algorithms
When the number of data points $N$ in the DFT is a power of 4 (i.e., $N = 4^v$), we can, of course, always use a radix-2 algorithm for the computation. However, for this case, it is more efficient computationally to employ a radix-4 FFT algorithm.
Let us begin by describing a radix-4 decimation-in-time FFT algorithm, which is obtained by selecting $L = 4$ and $M = N/4$ in the divide-and-conquer approach described in Section 6.1.2. For this choice of $L$ and $M$, we have $l, p = 0, 1, 2, 3$; $m, q = 0, 1, \ldots, N/4 - 1$; $n = 4m + l$; and $k = (N/4)p + q$. Thus we split or decimate the $N$-point input sequence into four subsequences, $x(4n)$, $x(4n+1)$, $x(4n+2)$, $x(4n+3)$, $n = 0, 1, \ldots, N/4 - 1$.
By applying (6.1.15) we obtain
$$X(p, q) = \sum_{l=0}^{3} \left[ W_N^{lq} F(l, q) \right] W_4^{lp}, \qquad p = 0, 1, 2, 3 \tag{6.1.39}$$
where $F(l, q)$ is given by (6.1.16), that is,
$$F(l, q) = \sum_{m=0}^{(N/4)-1} x(l, m) W_{N/4}^{mq}, \qquad l = 0, 1, 2, 3, \quad q = 0, 1, 2, \ldots, \frac{N}{4} - 1 \tag{6.1.40}$$
and
$$x(l, m) = x(4m + l) \tag{6.1.41}$$
$$X(p, q) = X\!\left(\frac{N}{4} p + q\right) \tag{6.1.42}$$
Thus, the four $N/4$-point DFTs obtained from (6.1.40) are combined according to (6.1.39) to yield the $N$-point DFT. The expression in (6.1.39) for combining the $N/4$-point DFTs defines a radix-4 decimation-in-time butterfly, which can be expressed in matrix form as
$$\begin{bmatrix} X(0, q) \\ X(1, q) \\ X(2, q) \\ X(3, q) \end{bmatrix} = \begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & -j & -1 & j \\ 1 & -1 & 1 & -1 \\ 1 & j & -1 & -j \end{bmatrix} \begin{bmatrix} W_N^{0} F(0, q) \\ W_N^{q} F(1, q) \\ W_N^{2q} F(2, q) \\ W_N^{3q} F(3, q) \end{bmatrix} \tag{6.1.43}$$
The radix-4 butterfly is depicted in Fig. 6.12(a) and in a more compact form in Fig. 6.12(b). Note that since $W_N^0 = 1$, each butterfly involves three complex multiplications and 12 complex additions.
T h is d e c im a tio n -in -tim e p ro c e d u re can b e re p e a te d recu rsiv ely v tim es. H e n c e
th e re su ltin g F F T alg o rith m co n sists o f v sta g es, w h e re e a c h sta g e c o n ta in s A74
b u tte rflie s . C o n s e q u e n tly , th e c o m p u ta tio n a l b u r d e n fo r th e a lg o rith m is 3 v N / 4 =
(3jV /8) lo g ; N c o m p lex m u ltip lic a tio n s a n d O N f l ) log2 N co m p lex a d d itio n s. W e
n o te th a t th e n u m b e r o f m u ltip lic a tio n s is re d u c e d by 2 5 % , b u t th e n u m b e r o f
a d d itio n s h a s in c re a se d b y 50% fro m N log2 N to O N f l ) log2 N .
Figure 6.12 Basic butterfly computation in a radix-4 FFT algorithm.
It is interesting to note, however, that by performing the additions in two
steps, it is possible to reduce the number of additions per butterfly from 12 to 8.
This can be accomplished by expressing the matrix of the linear transformation in
(6.1.43) as a product of two matrices as follows:

$$\begin{bmatrix} X(0, q) \\ X(1, q) \\ X(2, q) \\ X(3, q) \end{bmatrix} =
\begin{bmatrix} 1 & 0 & 1 & 0 \\ 0 & 1 & 0 & -j \\ 1 & 0 & -1 & 0 \\ 0 & 1 & 0 & j \end{bmatrix}
\begin{bmatrix} 1 & 0 & 1 & 0 \\ 1 & 0 & -1 & 0 \\ 0 & 1 & 0 & 1 \\ 0 & 1 & 0 & -1 \end{bmatrix}
\begin{bmatrix} W_N^{0} F(0, q) \\ W_N^{q} F(1, q) \\ W_N^{2q} F(2, q) \\ W_N^{3q} F(3, q) \end{bmatrix} \tag{6.1.44}$$

Now each matrix multiplication involves four additions for a total of eight additions.
Thus the total number of complex additions is reduced to $N \log_2 N$, which
is identical to the radix-2 FFT algorithm. The computational savings results from
the 25% reduction in the number of complex multiplications.
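The two-step butterfly can be checked numerically. The following is a minimal Python sketch, not the book's implementation: the function names `radix4_butterfly` and `dft4` and the test vector are illustrative assumptions. It applies the two sparse matrices of (6.1.44) in sequence and compares the result with a direct four-point DFT, which corresponds to the single matrix of (6.1.43).

```python
import cmath

def radix4_butterfly(v):
    """Two-step radix-4 butterfly following the factorization in (6.1.44).

    v holds the four already-twiddled inputs W_N^{lq} F(l, q), l = 0..3.
    Multiplication by +j or -j only swaps real and imaginary parts, so the
    arithmetic cost is the eight complex additions below.
    """
    v0, v1, v2, v3 = v
    # Step 1 (right-hand matrix): four additions.
    u0, u1 = v0 + v2, v0 - v2
    u2, u3 = v1 + v3, v1 - v3
    # Step 2 (left-hand matrix): four more additions.
    ju3 = 1j * u3
    return [u0 + u2, u1 - ju3, u0 - u2, u1 + ju3]

def dft4(v):
    """Direct four-point DFT, equivalent to the matrix in (6.1.43)."""
    return [sum(v[l] * cmath.exp(-2j * cmath.pi * l * p / 4) for l in range(4))
            for p in range(4)]

# Arbitrary test vector (hypothetical values).
v = [1 + 2j, -0.5 + 1j, 3 - 1j, 0.25 + 0.75j]
fast, ref = radix4_butterfly(v), dft4(v)
max_err = max(abs(a - b) for a, b in zip(fast, ref))
```

Here `max_err` should be at the level of floating-point round-off, which indicates that the factored form and the full matrix produce the same outputs.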
An illustration of a radix-4 decimation-in-time FFT algorithm is shown in
Fig. 6.13 for $N = 16$. Note that in this algorithm, the input sequence is in normal
order while the output DFT is shuffled. In the radix-4 FFT algorithm, where
the decimation is by a factor of 4, the order of the decimated sequence can be
determined by reversing the order of the digits that represent the index $n$
in a quaternary number system (i.e., the number system based on the digits 0,
1, 2, 3).
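The digit-reversal rule can be sketched in a few lines of Python. The helper name `digit_reverse` and its parameters are assumptions made for this illustration; the function writes the index in base 4 and reads the digits back in the opposite order.

```python
def digit_reverse(index, num_digits, radix=4):
    """Reverse the base-`radix` digits of `index` (num_digits digits wide)."""
    rev = 0
    for _ in range(num_digits):
        index, digit = divmod(index, radix)  # peel off the lowest digit
        rev = rev * radix + digit            # append it at the other end
    return rev

# N = 16 = 4^2, so each index has two quaternary digits.
order = [digit_reverse(n, 2) for n in range(16)]
# order == [0, 4, 8, 12, 1, 5, 9, 13, 2, 6, 10, 14, 3, 7, 11, 15]
```

For example, index 1 is 01 in base 4, which reverses to 10, that is, 4. Applying the reversal twice returns the original index.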
A radix-4 decimation-in-frequency FFT algorithm can be obtained by selecting
$L = N/4$, $M = 4$; $l, p = 0, 1, \ldots, N/4 - 1$; $m, q = 0, 1, 2, 3$; $n = (N/4)m + l$;
and $k = 4p + q$. With this choice of parameters, the general equation given by
Figure 6.13 Sixteen-point radix-4 decimation-in-time algorithm with input in
normal order and output in digit-reversed order.
(6.1.15) can be expressed as

$$X(p, q) = \sum_{l=0}^{(N/4)-1} G(l, q)\, W_{N/4}^{lp} \tag{6.1.45}$$

where

$$G(l, q) = W_N^{lq} F(l, q), \qquad q = 0, 1, 2, 3; \quad l = 0, 1, \ldots, \frac{N}{4} - 1 \tag{6.1.46}$$

and

$$F(l, q) = \sum_{m=0}^{3} x(l, m)\, W_4^{mq}, \qquad q = 0, 1, 2, 3; \quad l = 0, 1, \ldots, \frac{N}{4} - 1 \tag{6.1.47}$$
We note that $X(p, q) = X(4p + q)$, $q = 0, 1, 2, 3$. Consequently, the $N$-point
DFT is decimated into four $N/4$-point DFTs and hence we have a decimation-in-frequency
FFT algorithm. The computations in (6.1.46) and (6.1.47) define
Figure 6.14 Sixteen-point, radix-4 decimation-in-frequency algorithm with input
in normal order and output in digit-reversed order.
the basic radix-4 butterfly for the decimation-in-frequency algorithm. Note that
the multiplications by the factors $W_N^{lq}$ occur after the combination of the data
points $x(l, m)$, just as in the case of the radix-2 decimation-in-frequency algorithm.
A 16-point radix-4 decimation-in-frequency FFT algorithm is shown in
Fig. 6.14. Its input is in normal order and its output is in digit-reversed order.
It has exactly the same computational complexity as the decimation-in-time radix-4
FFT algorithm.
For illustrative purposes, let us rederive the radix-4 decimation-in-frequency
algorithm by breaking the $N$-point DFT formula into four smaller DFTs. We
have

$$\begin{aligned}
X(k) &= \sum_{n=0}^{N-1} x(n) W_N^{kn} \\
&= \sum_{n=0}^{N/4-1} x(n) W_N^{kn} + \sum_{n=N/4}^{N/2-1} x(n) W_N^{kn}
 + \sum_{n=N/2}^{3N/4-1} x(n) W_N^{kn} + \sum_{n=3N/4}^{N-1} x(n) W_N^{kn} \\
&= \sum_{n=0}^{N/4-1} x(n) W_N^{kn}
 + W_N^{Nk/4} \sum_{n=0}^{N/4-1} x\!\left(n + \frac{N}{4}\right) W_N^{kn} \\
&\quad + W_N^{Nk/2} \sum_{n=0}^{N/4-1} x\!\left(n + \frac{N}{2}\right) W_N^{kn}
 + W_N^{3Nk/4} \sum_{n=0}^{N/4-1} x\!\left(n + \frac{3N}{4}\right) W_N^{kn}
\end{aligned} \tag{6.1.48}$$
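From the definition of the twiddle factors, we have

$$W_N^{kN/4} = (-j)^k, \qquad W_N^{Nk/2} = (-1)^k, \qquad W_N^{3kN/4} = (j)^k \tag{6.1.49}$$

After substitution of (6.1.49) into (6.1.48), we obtain

$$X(k) = \sum_{n=0}^{N/4-1} \left[ x(n) + (-j)^k x\!\left(n + \frac{N}{4}\right)
 + (-1)^k x\!\left(n + \frac{N}{2}\right) + (j)^k x\!\left(n + \frac{3N}{4}\right) \right] W_N^{nk} \tag{6.1.50}$$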
The relation in (6.1.50) is not an $N/4$-point DFT because the twiddle factor
depends on $N$ and not on $N/4$. To convert it into an $N/4$-point DFT, we subdivide
the DFT sequence into four $N/4$-point subsequences, $X(4k)$, $X(4k + 1)$, $X(4k + 2)$,
and $X(4k + 3)$, $k = 0, 1, \ldots, N/4 - 1$. Thus we obtain the radix-4 decimation-in-frequency
DFT as

$$X(4k) = \sum_{n=0}^{N/4-1} \left[ x(n) + x\!\left(n + \frac{N}{4}\right)
 + x\!\left(n + \frac{N}{2}\right) + x\!\left(n + \frac{3N}{4}\right) \right] W_N^{0}\, W_{N/4}^{kn} \tag{6.1.51}$$

$$X(4k + 1) = \sum_{n=0}^{N/4-1} \left[ x(n) - j x\!\left(n + \frac{N}{4}\right)
 - x\!\left(n + \frac{N}{2}\right) + j x\!\left(n + \frac{3N}{4}\right) \right] W_N^{n}\, W_{N/4}^{kn} \tag{6.1.52}$$

$$X(4k + 2) = \sum_{n=0}^{N/4-1} \left[ x(n) - x\!\left(n + \frac{N}{4}\right)
 + x\!\left(n + \frac{N}{2}\right) - x\!\left(n + \frac{3N}{4}\right) \right] W_N^{2n}\, W_{N/4}^{kn} \tag{6.1.53}$$

$$X(4k + 3) = \sum_{n=0}^{N/4-1} \left[ x(n) + j x\!\left(n + \frac{N}{4}\right)
 - x\!\left(n + \frac{N}{2}\right) - j x\!\left(n + \frac{3N}{4}\right) \right] W_N^{3n}\, W_{N/4}^{kn} \tag{6.1.54}$$
where we have used the property $W_N^{4kn} = W_{N/4}^{kn}$. Note that the input to each
$N/4$-point DFT is a linear combination of four signal samples scaled by a twiddle factor.
This procedure is repeated $v$ times, where $v = \log_4 N$.
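The decomposition in (6.1.51) through (6.1.54) can be applied recursively. The sketch below is one possible Python rendering under simplifying assumptions: it allocates new lists at each level rather than working in place, it recomputes twiddle factors instead of using a table, and it returns the DFT in normal order by interleaving the four sub-results. The names `fft_radix4_dif` and `dft` are illustrative only.

```python
import cmath

def dft(x):
    """Direct O(N^2) DFT, used only as a reference."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
            for k in range(N)]

def fft_radix4_dif(x):
    """Radix-4 decimation-in-frequency FFT; len(x) must be a power of 4."""
    N = len(x)
    if N == 1:
        return list(x)
    q = N // 4
    a, b, c, d = [], [], [], []
    for n in range(q):
        x0, x1, x2, x3 = x[n], x[n + q], x[n + 2 * q], x[n + 3 * q]
        w = cmath.exp(-2j * cmath.pi * n / N)                # W_N^n
        a.append(x0 + x1 + x2 + x3)                          # input for X(4k)
        b.append((x0 - 1j * x1 - x2 + 1j * x3) * w)          # input for X(4k+1)
        c.append((x0 - x1 + x2 - x3) * w ** 2)               # input for X(4k+2)
        d.append((x0 + 1j * x1 - x2 - 1j * x3) * w ** 3)     # input for X(4k+3)
    X = [0j] * N
    X[0::4] = fft_radix4_dif(a)
    X[1::4] = fft_radix4_dif(b)
    X[2::4] = fft_radix4_dif(c)
    X[3::4] = fft_radix4_dif(d)
    return X

# Arbitrary 16-point complex test sequence (hypothetical values).
x = [complex(n % 5, (3 * n) % 7) for n in range(16)]
err = max(abs(p - r) for p, r in zip(fft_radix4_dif(x), dft(x)))
```

For N = 16 the recursion depth is v = 2, matching the two stages of Fig. 6.14. An in-place version would leave the output in digit-reversed order instead of interleaving.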
6.1.5 Split-Radix FFT Algorithms
An inspection of the radix-2 decimation-in-frequency flowgraph shown in Fig. 6.11
indicates that the even-numbered points of the DFT can be computed independently
of the odd-numbered points. This suggests the possibility of using different
computational methods for independent parts of the algorithm with the objective
of reducing the number of computations. The split-radix FFT (SRFFT) algorithms
exploit this idea by using both a radix-2 and a radix-4 decomposition in the same
FFT algorithm.
We illustrate this approach with a decimation-in-frequency SRFFT algorithm
due to Duhamel (1986). First, we recall that in the radix-2 decimation-in-frequency
FFT algorithm, the even-numbered samples of the $N$-point DFT are given as

$$X(2k) = \sum_{n=0}^{N/2-1} \left[ x(n) + x\!\left(n + \frac{N}{2}\right) \right] W_{N/2}^{nk}, \qquad k = 0, 1, \ldots, \frac{N}{2} - 1 \tag{6.1.55}$$

Note that these DFT points can be obtained from an $N/2$-point DFT without any
additional multiplications. Consequently, a radix-2 algorithm suffices for this computation.
The odd-numbered samples $\{X(2k + 1)\}$ of the DFT require the premultiplication
of the input sequence with the twiddle factors $W_N^{n}$. For these samples a
radix-4 decomposition produces some computational efficiency because the four-point
DFT has the largest multiplication-free butterfly. Indeed, it can be shown
that using a radix greater than 4 does not result in a significant reduction in
computational complexity.
If we use a radix-4 decimation-in-frequency FFT algorithm for the odd-numbered
samples of the $N$-point DFT, we obtain the following $N/4$-point DFTs:

$$X(4k + 1) = \sum_{n=0}^{N/4-1} \left\{ \left[ x(n) - x(n + N/2) \right]
 - j \left[ x(n + N/4) - x(n + 3N/4) \right] \right\} W_N^{n}\, W_{N/4}^{kn} \tag{6.1.56}$$

$$X(4k + 3) = \sum_{n=0}^{N/4-1} \left\{ \left[ x(n) - x(n + N/2) \right]
 + j \left[ x(n + N/4) - x(n + 3N/4) \right] \right\} W_N^{3n}\, W_{N/4}^{kn} \tag{6.1.57}$$
Thus the $N$-point DFT is decomposed into one $N/2$-point DFT without additional
twiddle factors and two $N/4$-point DFTs with twiddle factors. The $N$-point DFT
is obtained by successive use of these decompositions up to the last stage. Thus
we obtain a decimation-in-frequency SRFFT algorithm.
Figure 6.15 shows the flow graph for an in-place 32-point decimation-in-frequency
SRFFT algorithm. At stage A of the computation for $N = 32$, the
Figure 6.15 Length 32 split-radix FFT algorithms from paper by Duhamel (1986); reprinted
with permission from the IEEE.
top 16 points constitute the sequence

$$g_0(n) = x(n) + x(n + N/2), \qquad 0 \le n \le 15 \tag{6.1.58}$$

This is the sequence required for the computation of $X(2k)$. The next 8 points
constitute the sequence

$$g_1(n) = x(n) - x(n + N/2), \qquad 0 \le n \le 7 \tag{6.1.59}$$
The bottom eight points constitute the sequence $j g_2(n)$, where

$$g_2(n) = x(n + N/4) - x(n + 3N/4), \qquad 0 \le n \le 7 \tag{6.1.60}$$

The sequences $g_1(n)$ and $g_2(n)$ are used in the computation of $X(4k + 1)$ and
$X(4k + 3)$. Thus, at stage A we have completed the first decimation for the radix-2
component of the algorithm. At stage B, the bottom eight points constitute the
computation of $[g_1(n) + j g_2(n)] W_{32}^{3n}$, $0 \le n \le 7$, which is used to compute $X(4k + 3)$,
$0 \le k \le 7$. The next eight points from the bottom constitute the computation of
$[g_1(n) - j g_2(n)] W_{32}^{n}$, $0 \le n \le 7$, which is used to compute $X(4k + 1)$, $0 \le k \le 7$.
Thus at stage B, we have completed the first decimation for the radix-4 algorithm,
which results in two 8-point sequences. Hence the basic butterfly computation for
the SRFFT algorithm has the "L-shaped" form illustrated in Fig. 6.16.
Now we repeat the steps in the computation above. Beginning with the top
16 points at stage A, we repeat the decomposition for the 16-point DFT. In other
words, we decompose the computation into an eight-point, radix-2 DFT and two
four-point, radix-4 DFTs. Thus at stage B, the top eight points constitute the
sequence (with $N = 16$)

$$g_0'(n) = g_0(n) + g_0(n + N/2), \qquad 0 \le n \le 7 \tag{6.1.61}$$

and the next eight points constitute the two four-point sequences $g_1'(n)$ and $j g_2'(n)$,
where

$$\begin{aligned}
g_1'(n) &= g_0(n) - g_0(n + N/2), & 0 \le n \le 3 \\
g_2'(n) &= g_0(n + N/4) - g_0(n + 3N/4), & 0 \le n \le 3
\end{aligned} \tag{6.1.62}$$

The bottom 16 points of stage B are in the form of two eight-point DFTs. Hence
each eight-point DFT is decomposed into a four-point, radix-2 DFT and a four-point,
radix-4 DFT. In the final stage, the computations involve the combination
of two-point sequences.
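The split-radix decomposition can be sketched recursively in Python. This is an illustration under stated assumptions, not Duhamel's in-place program: the function names are hypothetical, twiddle factors are recomputed rather than tabulated, and the output is assembled in normal order. The even-indexed outputs come from one N/2-point transform of g0(n), and the outputs X(4k + 1) and X(4k + 3) come from two N/4-point transforms of the twiddled combinations of g1(n) and g2(n).

```python
import cmath

def dft(x):
    """Direct O(N^2) DFT, used only as a reference."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
            for k in range(N)]

def fft_split_radix(x):
    """Split-radix decimation-in-frequency FFT; len(x) must be a power of 2."""
    N = len(x)
    if N == 1:
        return list(x)
    if N == 2:
        return [x[0] + x[1], x[0] - x[1]]
    h, q = N // 2, N // 4
    # Radix-2 part: g0(n) = x(n) + x(n + N/2) feeds X(2k).
    g0 = [x[n] + x[n + h] for n in range(h)]
    odd1, odd3 = [], []
    for n in range(q):
        g1 = x[n] - x[n + h]              # g1(n) = x(n) - x(n + N/2)
        g2 = x[n + q] - x[n + 3 * q]      # g2(n) = x(n + N/4) - x(n + 3N/4)
        w = cmath.exp(-2j * cmath.pi * n / N)
        odd1.append((g1 - 1j * g2) * w)        # feeds X(4k + 1)
        odd3.append((g1 + 1j * g2) * w ** 3)   # feeds X(4k + 3)
    X = [0j] * N
    X[0::2] = fft_split_radix(g0)
    X[1::4] = fft_split_radix(odd1)
    X[3::4] = fft_split_radix(odd3)
    return X

# Arbitrary 32-point test sequence (hypothetical values).
x = [complex((5 * n) % 9, n % 4) for n in range(32)]
err = max(abs(p - r) for p, r in zip(fft_split_radix(x), dft(x)))
```

The three recursive calls per level mirror the L-shaped butterfly: one half-length branch and two quarter-length branches.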
Table 6.2 presents a comparison of the number of nontrivial real multiplications
and additions required to perform an $N$-point DFT with complex-valued
TABLE 6.2 NUMBER OF NONTRIVIAL REAL MULTIPLICATIONS AND
ADDITIONS TO COMPUTE AN N-POINT COMPLEX DFT
                 Real Multiplications                       Real Additions
     N   Radix 2  Radix 4  Radix 8  Split Radix   Radix 2  Radix 4  Radix 8  Split Radix
    16       24       20        -           20       152      148        -          148
    32       88        -        -           68       408        -        -          388
    64      264      208      204          196     1,032      976      972          964
   128      712        -        -          516     2,504        -        -        2,308
   256    1,800    1,392        -        1,284     5,896    5,488        -        5,380
   512    4,360        -    3,204        3,076    13,566        -   12,420       12,292
 1,024   10,248    7,856        -        7,172    30,728   28,336        -       27,652
Source: Extracted from Duhamel (1986).
data, using a radix-2, radix-4, radix-8, and a split-radix FFT. Note that the SRFFT
algorithm requires the lowest number of multiplications and additions. For this
reason, it is preferable in many practical applications.
Another type of SRFFT algorithm has been developed by Price (1990). Its
relation to Duhamel's algorithm described previously can be seen by noting that
the radix-4 DFT terms $X(4k + 1)$ and $X(4k + 3)$ involve the $N/4$-point DFTs of the
sequences $[g_1(n) - j g_2(n)] W_N^{n}$ and $[g_1(n) + j g_2(n)] W_N^{3n}$, respectively. In effect, the
sequences $g_1(n)$ and $g_2(n)$ are multiplied by the factor (vector) $(1, -j) = (1, W_{32}^{8})$
and by $W_N^{n}$ for the computation of $X(4k + 1)$, while the computation of $X(4k + 3)$
involves the factor $(1, j) = (1, W_{32}^{-8})$ and $W_N^{3n}$. Instead, one can rearrange the
computation so that the factor for $X(4k + 3)$ is $(-j, -1) = -(W_{32}^{-8}, 1)$. As a result
of this phase rotation, the twiddle factors in the computation of $X(4k + 3)$ become
exactly the same as those for $X(4k + 1)$, except that they occur in mirror image
order. For example, at stage B of Fig. 6.15, the twiddle factors $W^{21}, W^{18}, \ldots, W^{3}$
are replaced by $W^{1}, W^{2}, \ldots, W^{7}$, respectively. This mirror-image symmetry occurs
at every subsequent stage of the algorithm. As a consequence, the number of
twiddle factors that must be computed and stored is reduced by a factor of 2 in
comparison to Duhamel's algorithm. The resulting algorithm is called the "mirror"
FFT (MFFT) algorithm.
An additional factor-of-2 savings in storage of twiddle factors can be obtained
by introducing a 90° phase offset at the midpoint of each twiddle array, which can
be removed if necessary at the output of the SRFFT computation. The incorporation
of this improvement into the SRFFT (or the MFFT) results in another
algorithm, also due to Price (1990), called the "phase" FFT (PFFT) algorithm.
6.1.6 Implementation of FFT Algorithms
Now that we have described the basic radix-2 and radix-4 FFT algorithms, let
us consider some of the implementation issues. Our remarks apply directly to
radix-2 algorithms, although similar comments may be made about radix-4 and
higher-radix algorithms.
Basically, the radix-2 FFT algorithm consists of taking two data points at a
time from memory, performing the butterfly computations, and returning the resulting
numbers to memory. This procedure is repeated many times ($(N \log_2 N)/2$
times) in the computation of an $N$-point DFT.
The butterfly computations require the twiddle factors $\{W_N^{k}\}$ at various stages
in either natural or bit-reversed order. In an efficient implementation of the algorithm,
the phase factors are computed once and stored in a table, either in normal
order or in bit-reversed order, depending on the specific implementation of the
algorithm.
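As a concrete illustration of these remarks, the sketch below implements an in-place radix-2 decimation-in-frequency FFT in Python with a twiddle table that is computed once, stored in natural order, and indexed with a stage-dependent stride. It also counts butterflies. The function names, the stride scheme, and the test data are assumptions of this example rather than a prescribed implementation.

```python
import cmath

def bit_reverse(i, bits):
    """Reverse the lowest `bits` bits of the integer i."""
    r = 0
    for _ in range(bits):
        r = (r << 1) | (i & 1)
        i >>= 1
    return r

def fft_radix2_dif_inplace(x):
    """In-place radix-2 DIF FFT on list x (length a power of 2).

    Leaves the DFT in bit-reversed order and returns the butterfly count.
    """
    N = len(x)
    # Twiddle table W_N^k, k = 0..N/2-1, computed once in natural order.
    table = [cmath.exp(-2j * cmath.pi * k / N) for k in range(N // 2)]
    butterflies = 0
    span, stride = N // 2, 1
    while span >= 1:
        for start in range(0, N, 2 * span):
            for n in range(span):
                i, j = start + n, start + n + span
                a, b = x[i], x[j]              # fetch two points from memory
                x[i] = a + b                   # butterfly
                x[j] = (a - b) * table[n * stride]
                butterflies += 1               # results written back in place
        span //= 2
        stride *= 2
    return butterflies

def dft(x):
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
            for k in range(N)]

x = [complex((n * n) % 7, n % 3) for n in range(8)]   # hypothetical data
work = list(x)
count = fft_radix2_dif_inplace(work)
X = [work[bit_reverse(k, 3)] for k in range(8)]       # unscramble the output
err = max(abs(p - r) for p, r in zip(X, dft(x)))
```

For N = 8 the count is (N log2 N)/2 = 12 butterflies, in agreement with the figure quoted above.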
Memory requirement is another factor that must be considered. If the computations
are performed in place, the number of memory locations required is $2N$
since the numbers are complex. However, we can instead double the memory to
$4N$, thus simplifying the indexing and control operations in the FFT algorithms. In
this case we simply alternate in the use of the two sets of memory locations from
one stage of the FFT algorithm to the other. Doubling of the memory also allows
us to have both the input sequence and the output sequence in normal order.
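One well-known scheme that alternates between two buffers is the Stockham autosort formulation. The text does not name a specific method, so the following Python sketch should be read as one possible realization of the idea, with hypothetical function names. It uses two length-N complex arrays (that is, 4N real storage locations), swaps their roles after every stage, and delivers the output in normal order without a separate bit-reversal pass.

```python
import cmath

def fft_stockham(x):
    """Radix-2 Stockham autosort FFT using two alternating buffers.

    len(x) must be a power of 2. Input and output are both in normal order.
    """
    N = len(x)
    src = list(x)          # first set of memory locations
    dst = [0j] * N         # second set of memory locations
    n, s = N, 1            # n: current sub-DFT length, s: stride
    while n > 1:
        m = n // 2
        for p in range(m):
            w = cmath.exp(-2j * cmath.pi * p / n)
            for q in range(s):
                a = src[q + s * p]
                b = src[q + s * (p + m)]
                dst[q + s * (2 * p)] = a + b
                dst[q + s * (2 * p + 1)] = (a - b) * w
        src, dst = dst, src    # alternate the roles of the two buffers
        n, s = m, 2 * s
    return src

def dft(x):
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
            for k in range(N)]

x = [complex(n % 4, (2 * n) % 5) for n in range(16)]   # hypothetical data
err = max(abs(p - r) for p, r in zip(fft_stockham(x), dft(x)))
```

The inner loops use the same simple index pattern at every stage, which illustrates how the extra memory buys simpler indexing and control.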
There are a number of other implementation issues regarding indexing, bit
reversal, and the degree of parallelism in the computations. To a large extent,
these issues are a function of the specific algorithm and the type of implementation,
namely, a hardware or software implementation. In implementations based
on fixed-point arithmetic, or floating-point arithmetic on small machines, there
is also the issue of round-off errors in the computation. This topic is considered
in Section 6.4.
Although the FFT algorithms described previously were presented in the
context of computing the DFT efficiently, they can also be used to compute the
IDFT, which is

$$x(n) = \frac{1}{N} \sum_{k=0}^{N-1} X(k)\, W_N^{-kn} \tag{6.1.63}$$

The only difference between the two transforms is the normalization factor $1/N$
and the sign of the phase factor $W_N$. Consequently, an FFT algorithm for computing
the DFT can be converted to an FFT algorithm for computing the IDFT
by changing the sign on all the phase factors and dividing the final output of the
algorithm by $N$.
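The conversion from a DFT routine to an IDFT routine can be sketched as follows. The recursive radix-2 decimation-in-time function and its `sign` parameter are assumptions introduced for this illustration; the point is only that flipping the sign of the exponent in every phase factor and dividing by N yields the inverse transform.

```python
import cmath

def fft_radix2(x, sign=-1):
    """Recursive radix-2 DIT FFT; sign=-1 gives the DFT kernel W_N^{kn}."""
    N = len(x)
    if N == 1:
        return list(x)
    even = fft_radix2(x[0::2], sign)
    odd = fft_radix2(x[1::2], sign)
    out = [0j] * N
    for k in range(N // 2):
        t = cmath.exp(sign * 2j * cmath.pi * k / N) * odd[k]
        out[k] = even[k] + t
        out[k + N // 2] = even[k] - t
    return out

def ifft_radix2(X):
    """IDFT per (6.1.63): same algorithm, opposite sign, scaled by 1/N."""
    N = len(X)
    return [v / N for v in fft_radix2(X, sign=+1)]

x = [1, 2, 3, 4, 0, -1, -2, 5]          # hypothetical data
X = fft_radix2(x)
x_back = ifft_radix2(X)
roundtrip_err = max(abs(a - b) for a, b in zip(x_back, x))
```

A round trip through both functions should reproduce the input to within round-off.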
In fact, if we take the decimation-in-time algorithm that we described in
Section 6.1.3, reverse the direction of the flow graph, change the sign on the phase
factors, interchange the output and input, and finally, divide the output by $N$, we
obtain a decimation-in-frequency FFT algorithm for computing the IDFT. On the
other hand, if we begin with the decimation-in-frequency FFT algorithm described
in Section 6.1.3 and repeat the changes described above, we obtain a decimation-in-time
FFT algorithm for computing the IDFT. Thus it is a simple matter to devise
FFT algorithms for computing the IDFT.
Finally, we note that the emphasis in our discussion of FFT algorithms was
on radix-2, radix-4, and split-radix algorithms. These are by far the most widely
used in practice. When the number of data points is not a power of 2 or 4, it is a
simple matter to pad the sequence $x(n)$ with zeros such that $N = 2^v$ or $N = 4^v$.
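Zero padding can be sketched in a few lines of Python. The helper name `pad_to_power` is hypothetical. Note that padding changes the transform length, so the padded DFT samples the spectrum of the same finite sequence on a denser frequency grid.

```python
def pad_to_power(x, radix=2):
    """Append zeros so that the length becomes the next power of `radix`."""
    n = 1
    while n < len(x):
        n *= radix
    return list(x) + [0] * (n - len(x))

x = [3, 1, 4, 1, 5]                 # hypothetical 5-point sequence
x2 = pad_to_power(x, radix=2)       # length 8  = 2^3
x4 = pad_to_power(x, radix=4)       # length 16 = 4^2
```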
The measure of complexity for FFT algorithms that we have emphasized
is the required number of arithmetic operations (multiplications and additions).
Although this is a very important benchmark for computational complexity, there
are other issues to be considered in practical implementation of FFT algorithms.
These include the architecture of the processor, the available instruction set, the
data structures for storing twiddle factors, and other considerations.
For general-purpose computers, where the cost of the numerical operations
dominates, radix-2, radix-4, and split-radix FFT algorithms are good candidates.
However, in the case of special-purpose digital signal processors, featuring single-cycle
multiply-and-accumulate operation, bit-reversed addressing, and a high degree
of instruction parallelism, the structural regularity of the algorithm is as
important as arithmetic complexity. Hence for DSP processors, radix-2 or radix-4
decimation-in-frequency FFT algorithms are preferable in terms of speed and
accuracy. The irregular structure of the SRFFT may render it less suitable for
implementation on digital signal processors. Structural regularity is also important
in the implementation of FFT algorithms on vector processors, multiprocessors,
and in VLSI. Interprocessor communication is an important consideration in such
implementations on parallel processors.
In conclusion, we have presented several important considerations in the
implementation of FFT algorithms. Advances in digital signal processing technology,
in hardware and software, will continue to influence the choice among FFT
algorithms for various practical applications.
6.2 APPLICATIONS OF FFT ALGORITHMS
The FFT algorithms described in the preceding section find application in a variety
of areas, including linear filtering, correlation, and spectrum analysis. Basically,
the FFT algorithm is used as an efficient means to compute the DFT and the IDFT.
In this section we consider the use of the FFT algorithm in linear filtering
and in the computation of the crosscorrelation of two sequences. The use of the
FFT in spectrum analysis is considered in Chapter 12. In addition we illustrate
how to enhance the efficiency of the FFT algorithm by forming complex-valued
sequences from real-valued sequences prior to the computation of the DFT.
6.2.1 Efficient Computation of the DFT of Two Real Sequences
The FFT algorithm is designed to perform complex multiplications and additions,
even though the input data may be real valued. The basic reason for this situation is
that the phase factors are complex and hence, after the first stage of the algorithm,
all variables are basically complex-valued.
In view of the fact that the algorithm can handle complex-valued input sequences,
we can exploit this capability in the computation of the DFT of two
real-valued sequences.
Suppose that $x_1(n)$ and $x_2(n)$ are two real-valued sequences of length $N$, and
let $x(n)$ be a complex-valued sequence defined as

$$x(n) = x_1(n) + j x_2(n), \qquad 0 \le n \le N - 1 \tag{6.2.1}$$

The DFT operation is linear and hence the DFT of $x(n)$ can be expressed as

$$X(k) = X_1(k) + j X_2(k) \tag{6.2.2}$$

The sequences $x_1(n)$ and $x_2(n)$ can be expressed in terms of $x(n)$ as follows:

$$x_1(n) = \frac{x(n) + x^*(n)}{2} \tag{6.2.3}$$

$$x_2(n) = \frac{x(n) - x^*(n)}{2j} \tag{6.2.4}$$

Hence the DFTs of $x_1(n)$ and $x_2(n)$ are

$$X_1(k) = \frac{1}{2} \left\{ \mathrm{DFT}[x(n)] + \mathrm{DFT}[x^*(n)] \right\} \tag{6.2.5}$$

$$X_2(k) = \frac{1}{2j} \left\{ \mathrm{DFT}[x(n)] - \mathrm{DFT}[x^*(n)] \right\} \tag{6.2.6}$$

Recall that the DFT of $x^*(n)$ is $X^*(N - k)$. Therefore,

$$X_1(k) = \frac{1}{2} \left[ X(k) + X^*(N - k) \right] \tag{6.2.7}$$

$$X_2(k) = \frac{1}{2j} \left[ X(k) - X^*(N - k) \right] \tag{6.2.8}$$

Thus, by performing a single DFT on the complex-valued sequence $x(n)$, we
have obtained the DFT of the two real sequences with only a small amount of
additional computation that is involved in computing $X_1(k)$ and $X_2(k)$ from $X(k)$
by use of (6.2.7) and (6.2.8).
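The separation step in (6.2.7) and (6.2.8) can be illustrated with a short Python sketch. For brevity it uses a direct DFT in place of an FFT, and the function names and test data are assumptions of the example. The index N - k is taken modulo N, since the DFT is periodic and X(N) = X(0).

```python
import cmath

def dft(x):
    """Direct DFT; stands in for any FFT routine in this sketch."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
            for k in range(N)]

def dft_two_real(x1, x2):
    """DFTs of two real length-N sequences from a single complex DFT."""
    N = len(x1)
    X = dft([a + 1j * b for a, b in zip(x1, x2)])   # x(n) = x1(n) + j x2(n)
    X1, X2 = [], []
    for k in range(N):
        Xc = X[(N - k) % N].conjugate()             # X*(N - k)
        X1.append((X[k] + Xc) / 2)                  # (6.2.7)
        X2.append((X[k] - Xc) / 2j)                 # (6.2.8)
    return X1, X2

x1 = [1.0, 2.0, 0.5, -1.0, 3.0, 0.0, -2.0, 1.5]    # hypothetical data
x2 = [0.0, 1.0, 1.0, 2.0, -1.0, 4.0, 0.5, -0.5]
X1, X2 = dft_two_real(x1, x2)
err1 = max(abs(a - b) for a, b in zip(X1, dft(x1)))
err2 = max(abs(a - b) for a, b in zip(X2, dft(x2)))
```

Both error measures should be at round-off level, confirming that one complex transform recovers the two real-input transforms.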
6.2.2 Efficient Computation of the DFT of a 2N-Point Real Sequence
Suppose that $g(n)$ is a real-valued sequence of $2N$ points. We now demonstrate
how to obtain the $2N$-point DFT of $g(n)$ from computation of one $N$-point DFT
involving complex-valued data. First, we define

$$\begin{aligned}
x_1(n) &= g(2n) \\
x_2(n) &= g(2n + 1)
\end{aligned} \tag{6.2.9}$$
Thus we have subdivided the $2N$-point real sequence into two $N$-point real sequences.
Now we can apply the method described in the preceding section.
Let $x(n)$ be the $N$-point complex-valued sequence

$$x(n) = x_1(n) + j x_2(n) \tag{6.2.10}$$

From the results of the preceding section, we have

$$\begin{aligned}
X_1(k) &= \frac{1}{2} \left[ X(k) + X^*(N - k) \right] \\
X_2(k) &= \frac{1}{2j} \left[ X(k) - X^*(N - k) \right]
\end{aligned} \tag{6.2.11}$$

Finally, we must express the $2N$-point DFT in terms of the two $N$-point DFTs,
$X_1(k)$ and $X_2(k)$. To accomplish this, we proceed as in the decimation-in-time FFT
algorithm, namely,

$$\begin{aligned}
G(k) &= \sum_{n=0}^{N-1} g(2n)\, W_{2N}^{2nk} + \sum_{n=0}^{N-1} g(2n + 1)\, W_{2N}^{(2n+1)k} \\
&= \sum_{n=0}^{N-1} x_1(n)\, W_N^{nk} + W_{2N}^{k} \sum_{n=0}^{N-1} x_2(n)\, W_N^{nk}
\end{aligned}$$

Consequently,

$$\begin{aligned}
G(k) &= X_1(k) + W_{2N}^{k} X_2(k), & k = 0, 1, \ldots, N - 1 \\
G(k + N) &= X_1(k) - W_{2N}^{k} X_2(k), & k = 0, 1, \ldots, N - 1
\end{aligned} \tag{6.2.12}$$

Thus we have computed the DFT of a $2N$-point real sequence from one $N$-point
DFT and some additional computation as indicated by (6.2.11) and (6.2.12).
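The procedure of (6.2.9) through (6.2.12) can be sketched in Python as follows. As before, a direct DFT stands in for the N-point FFT, and the function names and test sequence are assumptions made for illustration.

```python
import cmath

def dft(x):
    """Direct DFT; stands in for any FFT routine in this sketch."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
            for k in range(N)]

def dft_real_2n(g):
    """2N-point DFT of a real sequence g from one N-point complex DFT."""
    N = len(g) // 2
    # (6.2.9), (6.2.10): even samples -> real part, odd samples -> imaginary part.
    X = dft([complex(g[2 * n], g[2 * n + 1]) for n in range(N)])
    G = [0j] * (2 * N)
    for k in range(N):
        Xc = X[(N - k) % N].conjugate()            # X*(N - k)
        X1 = (X[k] + Xc) / 2                       # (6.2.11)
        X2 = (X[k] - Xc) / 2j
        t = cmath.exp(-2j * cmath.pi * k / (2 * N)) * X2   # W_{2N}^k X2(k)
        G[k] = X1 + t                              # (6.2.12)
        G[k + N] = X1 - t
    return G

g = [float((3 * n * n + n) % 11) for n in range(16)]   # hypothetical real data
err = max(abs(a - b) for a, b in zip(dft_real_2n(g), dft(g)))
```

The comparison against the direct 16-point DFT should agree to within round-off.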
6.2.3 Use of the FFT Algorithm in Linear Filtering and
Correlation
An important application of the FFT algorithm is in FIR linear filtering of long
data sequences. In Chapter 5 we described two methods, the overlap-add and the
overlap-save methods, for filtering a long data sequence with an FIR filter, based
on the use of the DFT. In this section we consider the use of these two methods
in conjunction with the FFT algorithm for computing the DFT and the IDFT.
Let $h(n)$, $0 \le n \le M - 1$, be the unit sample response of the FIR filter and let
$x(n)$ denote the input data sequence. The block size of the FFT algorithm is $N$,
where $N = L + M - 1$ and $L$ is the number of new data samples being processed
by the filter. We assume that for any given value of $M$, the number $L$ of data
samples is selected so that $N$ is a power of 2. For purposes of this discussion, we
consider only radix-2 FFT algorithms.
The $N$-point DFT of $h(n)$, which is padded by $L - 1$ zeros, is denoted as $H(k)$.
This computation is performed once via the FFT and the resulting $N$ complex
numbers are stored. To be specific we assume that the decimation-in-frequency
FFT algorithm is used to compute $H(k)$. This yields $H(k)$ in bit-reversed order,
which is the way it is stored in memory.
In the overlap-save method, the first $M - 1$ data points of each data block are
the last $M - 1$ data points of the previous data block. Each data block contains $L$
new data points, such that $N = L + M - 1$. The $N$-point DFT of each data block
is performed by the FFT algorithm. If the decimation-in-frequency algorithm is
employed, the input data block requires no shuffling and the values of the DFT
occur in bit-reversed order. Since this is exactly the order of $H(k)$, we can multiply
the DFT of the data, say $X_m(k)$, with $H(k)$ and thus the result

$$Y_m(k) = H(k) X_m(k)$$

is also in bit-reversed order.
The inverse DFT (IDFT) can be computed by use of an FFT algorithm that
takes the input in bit-reversed order and produces an output in normal order.
Thus there is no need to shuffle any block of data either in computing the DFT
or the IDFT.
If the overlap-add method is used to perform the linear filtering, the computational
method using the FFT algorithm is basically the same. The only difference
is that the $N$-point data blocks consist of $L$ new data points and $M - 1$ additional
zeros. After the IDFT is computed for each data block, the $N$-point filtered blocks
are overlapped as indicated in Section 5.3.2, and the $M - 1$ overlapping data points
between successive output records are added together.
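The overlap-save procedure can be sketched in Python as shown below. This is a simplified illustration under stated assumptions: it uses a recursive radix-2 FFT that works in normal order throughout, so the bit-reversed storage of H(k) discussed above is not modeled, and the function names and test data are hypothetical.

```python
import cmath

def fft(x, sign=-1):
    """Recursive radix-2 FFT (sign=-1) or unscaled inverse (sign=+1)."""
    N = len(x)
    if N == 1:
        return list(x)
    even, odd = fft(x[0::2], sign), fft(x[1::2], sign)
    out = [0j] * N
    for k in range(N // 2):
        t = cmath.exp(sign * 2j * cmath.pi * k / N) * odd[k]
        out[k], out[k + N // 2] = even[k] + t, even[k] - t
    return out

def overlap_save(x, h, N):
    """Filter real list x with FIR h using N-point FFT blocks (N a power of 2)."""
    M = len(h)
    L = N - M + 1                              # new samples per block
    H = fft(list(h) + [0.0] * (L - 1))         # computed once and stored
    saved = [0.0] * (M - 1)                    # last M-1 points of previous block
    y = []
    for pos in range(0, len(x), L):
        new = list(x[pos:pos + L])
        new += [0.0] * (L - len(new))          # pad the final partial block
        block = saved + new
        Y = [Hk * Xk for Hk, Xk in zip(H, fft(block))]   # Y_m(k) = H(k) X_m(k)
        yb = [v / N for v in fft(Y, sign=+1)]            # IDFT
        y.extend(v.real for v in yb[M - 1:])   # discard the M-1 aliased points
        saved = block[L:]                      # carry over the last M-1 points
    return y[:len(x)]

def direct_fir(x, h):
    """Reference: direct-form convolution, first len(x) output samples."""
    return [sum(h[k] * x[n - k] for k in range(len(h)) if n - k >= 0)
            for n in range(len(x))]

x = [float((7 * n) % 5 - 2) for n in range(20)]   # hypothetical input
h = [0.5, -1.0, 0.25, 2.0]                        # hypothetical M = 4 filter
err = max(abs(a - b) for a, b in zip(overlap_save(x, h, 8), direct_fir(x, h)))
```

With N = 8 and M = 4 each block contributes L = 5 new output samples. An overlap-add variant would instead pad each block of L new samples with M - 1 zeros and add the overlapping tails of successive output blocks.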
Let us assess the computational complexity of the FFT method for linear filtering.
For this purpose, the one-time computation of $H(k)$ is insignificant and can
be ignored. Each FFT requires $(N/2) \log_2 N$ complex multiplications and $N \log_2 N$
additions. Since the FFT is performed twice, once for the DFT and once for the
IDFT, the computational burden is $N \log_2 N$ complex multiplications and $2N \log_2 N$
additions. There are also $N$ complex multiplications and $N - 1$ additions required
to compute $Y_m(k)$. Therefore, we have $(N \log_2 2N)/L$ complex multiplications per
output data point and approximately $(2N \log_2 2N)/L$ additions per output data
point. The overlap-add method requires an incremental increase of $(M - 1)/L$ in
the number of additions.
By way of comparison, a direct form realization of the FIR filter involves $M$
real multiplications per output point if the filter is not linear phase, and $M/2$ if it
is linear phase (symmetric). Also, the number of additions is $M - 1$ per output
point (see Sec. 8.2).
It is interesting to compare the efficiency of the FFT algorithm with the direct
form realization of the FIR filter. Let us focus on the number of multiplications,
which are more time consuming than additions. Suppose that $M = 128 = 2^7$ and
$N = 2^v$. Then the number of complex multiplications per output point for an FFT
size of $N = 2^v$ is

$$c(v) = \frac{N \log_2 2N}{L} = \frac{2^v (v + 1)}{N - M + 1}$$

TABLE 6.3 COMPUTATIONAL COMPLEXITY
  Size of FFT      c(v): Number of Complex Multiplications
  v = log2 N       per Output Point
       9                13.3
      10                12.6
      11                12.8
      12                13.4
      14                15.1
The values of $c(\nu)$ for different values of $\nu$ are given in Table 6.3. We observe that there is an optimum value of $\nu$ which minimizes $c(\nu)$. For the FIR filter of size $M = 128$, the optimum occurs at $\nu = 10$.
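The entries of Table 6.3 can be checked numerically. The following is a minimal sketch, not part of the original text; it assumes that each block of an $N$-point FFT produces $L = N - M + 1$ new output points, and the function name `mults_per_output` is hypothetical (the variable `v` stands for $\nu$).

```python
import math

def mults_per_output(v, M):
    """Complex multiplications per output point, c(v) = N*log2(2N)/L."""
    N = 2 ** v
    L = N - M + 1                  # assumed number of new outputs per block
    return N * math.log2(2 * N) / L

M = 128                            # FIR filter length used in the text
table = {v: mults_per_output(v, M) for v in range(8, 15)}
for v, c in table.items():
    print(f"v = {v:2d}   N = {2 ** v:5d}   c(v) = {c:5.1f}")

best_v = min(table, key=table.get) # FFT size exponent with the smallest cost
print("optimum v:", best_v)
```

Under these assumptions the minimum falls at `v = 10`, which agrees with the optimum quoted for $M = 128$.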
We should emphasize that $c(\nu)$ represents the number of complex multiplications for the FFT-based method. The number of real multiplications is four times this number. However, even if the FIR filter has linear phase (see Sec. 8.2), the number of computations per output point is still less with the FFT-based method. Furthermore, the efficiency of the FFT method can be improved by computing the DFT of two successive data blocks simultaneously, according to the method just described. Consequently, the FFT-based method is indeed superior from a computational point of view when the filter length is relatively large.
The computation of the cross correlation between two sequences by means of the FFT algorithm is similar to the linear FIR filtering problem just described. In practical applications involving cross correlation, at least one of the sequences has finite duration and is akin to the impulse response of the FIR filter. The second sequence may be a long sequence which contains the desired sequence corrupted by additive noise. Hence the second sequence is akin to the input to the FIR filter. By time reversing the first sequence and computing its DFT, we have reduced the cross correlation to an equivalent convolution problem (i.e., a linear FIR filtering problem). Therefore, the methodology we developed for linear FIR filtering by use of the FFT applies directly.
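The reduction of cross correlation to convolution can be illustrated with a short sketch. It is not from the original text: it assumes real-valued sequences, uses NumPy's FFT routines, processes the whole record in a single block instead of sectioning it, and the names `xcorr_fft`, `template`, and `signal` are hypothetical.

```python
import numpy as np

def xcorr_fft(x, y):
    """Cross-correlation sum_n x(n + l) * y(n) of two real sequences, via FFT.

    Output index i corresponds to lag l = i - (len(y) - 1).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n_out = len(x) + len(y) - 1                 # length of the linear result
    n_fft = 1 << (n_out - 1).bit_length()       # next power of 2, avoids aliasing
    X = np.fft.rfft(x, n_fft)
    Y_rev = np.fft.rfft(y[::-1], n_fft)         # DFT of the time-reversed sequence
    return np.fft.irfft(X * Y_rev, n_fft)[:n_out]

template = np.array([1.0, 2.0, 3.0])            # short, finite-duration sequence
signal = np.zeros(16)
signal[5:8] = template                          # template embedded at delay 5
r = xcorr_fft(signal, template)
lag = int(np.argmax(r)) - (len(template) - 1)   # lag of the correlation peak
```

The correlation peak falls at the lag where the template was embedded. For a long record, the same product of DFTs would be applied block by block, as in the FFT-based filtering method.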
6.3 A LINEAR FILTERING APPROACH TO COMPUTATION OF THE DFT
The FFT algorithm takes $N$ points of input data and produces an output sequence of $N$ points corresponding to the DFT of the input data. As we have shown, the radix-2 FFT algorithm performs the computation of the DFT in $(N/2)\log_2 N$ multiplications and $N\log_2 N$ additions for an $N$-point sequence.
There are some applications where only a selected number of values of the DFT are desired, but the entire DFT is not required. In such a case, the FFT algorithm may no longer be more efficient than a direct computation of the desired values of the DFT. In fact, when the desired number of values of the DFT is less than $\log_2 N$, a direct computation of the desired values is more efficient.
The direct computation of the DFT can be formulated as a linear filtering operation on the input data sequence. As we will demonstrate, the linear filter takes the form of a parallel bank of resonators where each resonator selects one of the frequencies $\omega_k = 2\pi k/N$, $k = 0, 1, \ldots, N - 1$, corresponding to the $N$ frequencies in the DFT.
There are other applications in which we require the evaluation of the z-transform of a finite-duration sequence at points other than the unit circle. If the set of desired points in the z-plane possesses some regularity, it is possible to also express the computation of the z-transform as a linear filtering operation. In this connection, we introduce another algorithm, called the chirp-z transform algorithm, which is suitable for evaluating the z-transform of a set of data on a variety of contours in the z-plane. This algorithm is also formulated as a linear filtering of a set of input data. As a consequence, the FFT algorithm can be used to compute the chirp-z transform and thus to evaluate the z-transform at various contours in the z-plane, including the unit circle.
6.3.1 The Goertzel Algorithm
The Goertzel algorithm exploits the periodicity of the phase factors $\{W_N^k\}$ and allows us to express the computation of the DFT as a linear filtering operation. Since $W_N^{-kN} = 1$, we can multiply the DFT by this factor. Thus
$$X(k) = W_N^{-kN}\sum_{m=0}^{N-1} x(m)W_N^{km} = \sum_{m=0}^{N-1} x(m)W_N^{-k(N-m)} \tag{6.3.1}$$
We note that (6.3.1) is in the form of a convolution. Indeed, if we define the sequence $y_k(n)$ as
$$y_k(n) = \sum_{m=0}^{N-1} x(m)W_N^{-k(n-m)} \tag{6.3.2}$$
then it is clear that $y_k(n)$ is the convolution of the finite-duration input sequence $x(n)$ of length $N$ with a filter that has an impulse response
$$h_k(n) = W_N^{-kn}u(n) \tag{6.3.3}$$
The output of this filter at $n = N$ yields the value of the DFT at the frequency $\omega_k = 2\pi k/N$. That is,
$$X(k) = y_k(n)\big|_{n=N} \tag{6.3.4}$$
as can be verified by comparing (6.3.1) with (6.3.2).
The filter with impulse response $h_k(n)$ has the system function
$$H_k(z) = \frac{1}{1 - W_N^{-k}z^{-1}} \tag{6.3.5}$$
This filter has a pole on the unit circle at the frequency $\omega_k = 2\pi k/N$. Thus, the entire DFT can be computed by passing the block of input data into a parallel bank of $N$ single-pole filters (resonators), where each filter has a pole at the corresponding frequency of the DFT.
Instead of performing the computation of the DFT as in (6.3.2), via convolution, we can use the difference equation corresponding to the filter given by (6.3.5) to compute $y_k(n)$ recursively. Thus we have
$$y_k(n) = W_N^{-k}y_k(n - 1) + x(n), \qquad y_k(-1) = 0 \tag{6.3.6}$$
The desired output is $X(k) = y_k(N)$, for $k = 0, 1, \ldots, N - 1$. To perform this computation, we can compute once and store the phase factors $W_N^{-k}$.
The complex multiplications and additions inherent in (6.3.6) can be avoided by combining the pairs of resonators possessing complex-conjugate poles. This leads to two-pole filters with system functions of the form
$$H_k(z) = \frac{1 - W_N^{k}z^{-1}}{1 - 2\cos(2\pi k/N)z^{-1} + z^{-2}} \tag{6.3.7}$$
The direct form II realization of the system illustrated in Fig. 6.17 is described by the difference equations
$$v_k(n) = 2\cos\frac{2\pi k}{N}\,v_k(n - 1) - v_k(n - 2) + x(n) \tag{6.3.8}$$
$$y_k(n) = v_k(n) - W_N^{k}v_k(n - 1) \tag{6.3.9}$$
with initial conditions $v_k(-1) = v_k(-2) = 0$.
The recursive relation in (6.3.8) is iterated for $n = 0, 1, \ldots, N$, but the equation in (6.3.9) is computed only once at time $n = N$. Each iteration requires one real multiplication and two additions. Consequently, for a real input sequence $x(n)$, this algorithm requires $N + 1$ real multiplications to yield not only $X(k)$ but also, due to symmetry, the value of $X(N - k)$.
The Goertzel algorithm is particularly attractive when the DFT is to be computed at a relatively small number $M$ of values, where $M < \log_2 N$. Otherwise, the FFT algorithm is a more efficient method.
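The two-pole recursion of (6.3.8) and (6.3.9) translates almost line by line into code. The following pure-Python sketch is an illustration only; the function name `goertzel` is hypothetical, and the extra iteration at $n = N$ is carried out with the assumed input value $x(N) = 0$.

```python
import cmath
import math

def goertzel(x, k):
    """Return the DFT value X(k) of the sequence x via the two-pole recursion."""
    N = len(x)
    coeff = 2.0 * math.cos(2.0 * math.pi * k / N)   # the only multiplier in the loop
    v1 = 0.0                                        # v_k(n - 1), initially v_k(-1) = 0
    v2 = 0.0                                        # v_k(n - 2), initially v_k(-2) = 0
    for sample in list(x) + [0.0]:                  # n = 0, 1, ..., N with x(N) = 0
        v0 = coeff * v1 - v2 + sample               # recursion (6.3.8)
        v2, v1 = v1, v0
    # Here v1 = v_k(N) and v2 = v_k(N - 1); (6.3.9) is evaluated once.
    W = cmath.exp(-2j * math.pi * k / N)            # W_N^k
    return v1 - W * v2

x = [1.0, 2.0, 0.5, -1.0, 0.0, 3.0, -2.0, 1.5]
X3 = goertzel(x, 3)
X5 = X3.conjugate()     # for real x(n), X(N - k) follows by symmetry
```

For a real input the loop involves only real arithmetic; the single complex multiplication occurs after the final iteration.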
Figure 6.17 Direct form II realization of two-pole resonator for computing the DFT.
6.3.2 The Chirp-z Transform Algorithm
The DFT of an $N$-point data sequence $x(n)$ has been viewed as the z-transform of $x(n)$ evaluated at $N$ equally spaced points on the unit circle. It has also been viewed as $N$ equally spaced samples of the Fourier transform of the data sequence $x(n)$. In this section we consider the evaluation of $X(z)$ on other contours in the z-plane, including the unit circle.
Suppose that we wish to compute the values of the z-transform of $x(n)$ at a set of points $\{z_k\}$. Then,
$$X(z_k) = \sum_{n=0}^{N-1} x(n)z_k^{-n}, \qquad k = 0, 1, \ldots, L - 1 \tag{6.3.10}$$
For example, if the contour is a circle of radius $r$ and the $z_k$ are $N$ equally spaced points, then
$$z_k = re^{j2\pi k/N}, \qquad k = 0, 1, 2, \ldots, N - 1$$
$$X(z_k) = \sum_{n=0}^{N-1} [x(n)r^{-n}]e^{-j2\pi kn/N}, \qquad k = 0, 1, 2, \ldots, N - 1 \tag{6.3.11}$$
In this case the FFT algorithm can be applied on the modified sequence $x(n)r^{-n}$.
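As a brief illustration of (6.3.11), the sketch below weights the data by $r^{-n}$ and applies one $N$-point FFT. It relies on NumPy's `numpy.fft.fft`; the function name `ztransform_on_circle` and the test values are hypothetical.

```python
import numpy as np

def ztransform_on_circle(x, r):
    """Return X(z_k) at z_k = r*exp(j*2*pi*k/N), k = 0..N-1, using one N-point FFT."""
    x = np.asarray(x, dtype=complex)
    n = np.arange(len(x), dtype=float)
    return np.fft.fft(x * r ** (-n))        # FFT of the modified sequence x(n) r^{-n}

x = np.array([1.0, 0.5, -0.25, 2.0])
r = 0.8
X_circle = ztransform_on_circle(x, r)

# Direct evaluation of the z-transform at the same points, for comparison.
N = len(x)
z = r * np.exp(2j * np.pi * np.arange(N) / N)
X_direct = np.array([np.sum(x * zk ** (-np.arange(N, dtype=float))) for zk in z])
```

With $r = 1$ the weighting disappears and the result is the ordinary DFT.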
More generally, suppose that the points $z_k$ in the z-plane fall on an arc which begins at some point
$$z_0 = r_0e^{j\theta_0}$$
and spirals either in toward the origin or out away from the origin such that the points are defined as
$$z_k = r_0e^{j\theta_0}(R_0e^{j\phi_0})^k, \qquad k = 0, 1, \ldots, L - 1 \tag{6.3.12}$$
Note that if $R_0 < 1$, the points fall on a contour that spirals toward the origin and if $R_0 > 1$, the contour spirals away from the origin. If $R_0 = 1$, the contour is a circular arc of radius $r_0$. If $r_0 = 1$ and $R_0 = 1$, the contour is an arc of the unit circle. The latter contour would allow us to compute the frequency content of the sequence $x(n)$ at a dense set of $L$ frequencies in the range covered by the arc without having to compute a large DFT, that is, a DFT of the sequence $x(n)$ padded with many zeros to obtain the desired resolution in frequency. Finally, if $r_0 = R_0 = 1$, $\theta_0 = 0$, $\phi_0 = 2\pi/N$, and $L = N$, the contour is the entire unit circle and the frequencies are those of the DFT. The various contours are illustrated in Fig. 6.18.
Figure 6.18 Some examples of contours on which we may evaluate the z-transform.
When points $\{z_k\}$ in (6.3.12) are substituted into the expression for the z-transform, we obtain
$$X(z_k) = \sum_{n=0}^{N-1} x(n)z_k^{-n} = \sum_{n=0}^{N-1} x(n)(r_0e^{j\theta_0})^{-n}V^{-nk} \tag{6.3.13}$$
where, by definition,
$$V = R_0e^{j\phi_0} \tag{6.3.14}$$
We can express (6.3.13) in the form of a convolution, by noting that
$$nk = \tfrac{1}{2}[n^2 + k^2 - (k - n)^2] \tag{6.3.15}$$
Substitution of (6.3.15) into (6.3.13) yields
$$X(z_k) = V^{-k^2/2}\sum_{n=0}^{N-1} [x(n)(r_0e^{j\theta_0})^{-n}V^{-n^2/2}]V^{(k-n)^2/2} \tag{6.3.16}$$
Let us define a new sequence $g(n)$ as
$$g(n) = x(n)(r_0e^{j\theta_0})^{-n}V^{-n^2/2} \tag{6.3.17}$$
Then (6.3.16) can be expressed as
$$X(z_k) = V^{-k^2/2}\sum_{n=0}^{N-1} g(n)V^{(k-n)^2/2} \tag{6.3.18}$$
The summation in (6.3.18) can be interpreted as the convolution of the sequence $g(n)$ with the impulse response $h(n)$ of a filter, where
$$h(n) = V^{n^2/2} \tag{6.3.19}$$
Consequently, (6.3.18) may be expressed as
$$X(z_k) = V^{-k^2/2}y(k) = \frac{y(k)}{h(k)}, \qquad k = 0, 1, \ldots, L - 1 \tag{6.3.20}$$
where $y(k)$ is the output of the filter
$$y(k) = \sum_{n=0}^{N-1} g(n)h(k - n), \qquad k = 0, 1, \ldots, L - 1 \tag{6.3.21}$$
We observe that both $h(n)$ and $g(n)$ are complex-valued sequences.
The sequence $h(n)$ with $R_0 = 1$ has the form of a complex exponential with argument $\omega n = n^2\phi_0/2 = (n\phi_0/2)n$. The quantity $n\phi_0/2$ represents the frequency of the complex exponential signal, which increases linearly with time. Such signals are used in radar systems and are called chirp signals. Hence the z-transform evaluated as in (6.3.18) is called the chirp-z transform.
The linear convolution in (6.3.21) is most efficiently done by use of the FFT algorithm. The sequence $g(n)$ is of length $N$. However, $h(n)$ has infinite duration. Fortunately, only a portion of $h(n)$ is required to compute the $L$ values of $X(z)$.
Since we will compute the convolution in (6.3.21) via the FFT, let us consider the circular convolution of the $N$-point sequence $g(n)$ with an $M$-point section of $h(n)$, where $M > N$. In such a case, we know that the first $N - 1$ points contain aliasing and that the remaining $M - N + 1$ points are identical to the result that would be obtained from a linear convolution of $h(n)$ with $g(n)$. In view of this, we should select a DFT of size
$$M = L + N - 1$$
which would yield $L$ valid points and $N - 1$ points corrupted by aliasing.
The section of $h(n)$ that is needed for this computation corresponds to the values of $h(n)$ for $-(N - 1) \le n \le (L - 1)$, which is of length $M = L + N - 1$, as observed from (6.3.21). Let us define the sequence $h_1(n)$ of length $M$ as
$$h_1(n) = h(n - N + 1), \qquad n = 0, 1, \ldots, M - 1 \tag{6.3.22}$$
and compute its $M$-point DFT via the FFT algorithm to obtain $H_1(k)$. From $x(n)$ we compute $g(n)$ as specified by (6.3.17), pad $g(n)$ with $L - 1$ zeros, and compute its $M$-point DFT to yield $G(k)$. The IDFT of the product $Y_1(k) = G(k)H_1(k)$ yields the $M$-point sequence $y_1(n)$, $n = 0, 1, \ldots, M - 1$. The first $N - 1$ points of $y_1(n)$ are corrupted by aliasing and are discarded. The desired values are $y_1(n)$ for $N - 1 \le n \le M - 1$, which correspond to the range $0 \le n \le L - 1$ in (6.3.21), that is,
$$y(n) = y_1(n + N - 1), \qquad n = 0, 1, \ldots, L - 1 \tag{6.3.23}$$
Alternatively, we can define a sequence $h_2(n)$ as
$$h_2(n) = \begin{cases} h(n), & 0 \le n \le L - 1 \\ h(n - N - L + 1), & L \le n \le M - 1 \end{cases} \tag{6.3.24}$$
The $M$-point DFT of $h_2(n)$ yields $H_2(k)$, which when multiplied by $G(k)$ yields $Y_2(k) = G(k)H_2(k)$. The IDFT of $Y_2(k)$ yields the sequence $y_2(n)$ for $0 \le n \le M - 1$. Now the desired values of $y_2(n)$ are in the range $0 \le n \le L - 1$, that is,
$$y(n) = y_2(n), \qquad n = 0, 1, \ldots, L - 1 \tag{6.3.25}$$
Finally, the complex values $X(z_k)$ are computed by dividing $y(k)$ by $h(k)$, $k = 0, 1, \ldots, L - 1$, as specified by (6.3.20).
In general, the computational complexity of the chirp-z transform algorithm described above is of the order of $M\log_2 M$ complex multiplications, where $M = N + L - 1$. This number should be compared with the product $N \cdot L$, the number of computations required by direct evaluation of the z-transform. Clearly, if $L$ is small, direct computation is more efficient. However, if $L$ is large, then the chirp-z transform algorithm is more efficient.
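The steps of the algorithm, using the wrapped sequence $h_2(n)$ of (6.3.24), can be sketched as follows. This is an illustrative sketch rather than a reference implementation: it relies on NumPy's FFT (which accepts an $M$ that is not a power of 2), and the function name `chirp_z`, its argument order, and the helper `chirp` are hypothetical.

```python
import numpy as np

def chirp_z(x, L, phi0, theta0=0.0, r0=1.0, R0=1.0):
    """Evaluate X(z_k), z_k = r0*exp(j*theta0) * (R0*exp(j*phi0))**k, k = 0..L-1."""
    x = np.asarray(x, dtype=complex)
    N = len(x)
    M = N + L - 1                        # DFT size, M = L + N - 1

    def chirp(m):
        # h(m) = V**(m^2/2) with V = R0*exp(j*phi0), written out explicitly
        m = np.asarray(m, dtype=float)
        return R0 ** (m * m / 2.0) * np.exp(1j * phi0 * m * m / 2.0)

    n = np.arange(N)
    k = np.arange(L)
    # g(n) = x(n) (r0 e^{j theta0})^{-n} V^{-n^2/2}, as in (6.3.17)
    g = x * r0 ** (-n.astype(float)) * np.exp(-1j * theta0 * n) / chirp(n)
    # h2(n): h(n) for 0 <= n <= L-1, then the negative-index values wrapped around
    h2 = chirp(np.concatenate((k, np.arange(L, M) - M)))
    # Circular convolution via M-point FFTs; the first L outputs are alias-free
    y = np.fft.ifft(np.fft.fft(g, M) * np.fft.fft(h2))[:L]
    return y / chirp(k)                  # X(z_k) = y(k) / h(k), as in (6.3.20)

x = np.array([1.0, -2.0, 0.5, 3.0, 0.0, -1.5, 2.0, 0.25])
X_dft = chirp_z(x, len(x), 2 * np.pi / len(x))            # DFT as a special case
X_arc = chirp_z(x, 5, phi0=0.2, theta0=0.3, r0=0.9, R0=1.02)
```

Choosing $r_0 = R_0 = 1$, $\theta_0 = 0$, $\phi_0 = 2\pi/N$, and $L = N$ reproduces the DFT, which offers a convenient check of the sketch.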
The chirp-z transform method has been implemented in hardware to compute the DFT of signals. For the computation of the DFT, we select $r_0 = R_0 = 1$, $\theta_0 = 0$, $\phi_0 = 2\pi/N$, and $L = N$. In this case
$$V^{-n^2/2} = e^{-j\pi n^2/N} = \cos\frac{\pi n^2}{N} - j\sin\frac{\pi n^2}{N} \tag{6.3.26}$$
The chirp filter with impulse response
$$h(n) = V^{n^2/2} = \cos\frac{\pi n^2}{N} + j\sin\frac{\pi n^2}{N} = h_r(n) + jh_i(n) \tag{6.3.27}$$
has been implemented as a pair of FIR filters with coefficients $h_r(n)$ and $h_i(n)$, respectively. Both surface acoustic wave (SAW) devices and charge coupled devices (CCD) have been used in practice for the FIR filters. The cosine and sine sequences given in (6.3.26) needed for the premultiplications and postmultiplications are usually stored in a read-only memory (ROM). Furthermore, we note that if only the magnitude of the DFT is desired, the postmultiplications are unnecessary. In this case,
$$|X(z_k)| = |y(k)|, \qquad k = 0, 1, \ldots, N - 1 \tag{6.3.28}$$
as illustrated in Fig. 6.19. Thus the linear FIR filtering approach using the chirp-z transform has been implemented for the computation of the DFT.
Figure 6.19 Block diagram illustrating the implementation of the chirp-z transform for computing the DFT (magnitude only).
6.4 QUANTIZATION EFFECTS IN THE COMPUTATION OF THE DFT*
As we have observed in our previous discussions, the DFT plays an important role in many digital signal processing applications, including FIR filtering, the computation of the correlation between signals, and spectral analysis. For this reason it is important for us to know the effect of quantization errors in its computation. In particular, we shall consider the effect of round-off errors due to the multiplications performed in the DFT with fixed-point arithmetic.
The model that we shall adopt for characterizing round-off errors in multiplication is the additive white noise model that we use in the statistical analysis of round-off errors in IIR and FIR filters (see Fig. 7.34). Although the statistical analysis is performed for rounding, the analysis can be easily modified to apply to truncation in two's-complement arithmetic (see Sec. 7.5.3).
*It is recommended that the reader review Section 7.5 prior to reading this section.
Of particular interest is the analysis of round-off errors in the computation of the DFT via the FFT algorithm. However, we shall first establish a benchmark by determining the round-off errors in the direct computation of the DFT.
6.4.1 Quantization Errors in the Direct Computation of the DFT
Given a finite-duration sequence $\{x(n)\}$, $0 \le n \le N - 1$, the DFT of $\{x(n)\}$ is defined as
$$X(k) = \sum_{n=0}^{N-1} x(n)W_N^{kn}, \qquad k = 0, 1, \ldots, N - 1 \tag{6.4.1}$$
where $W_N = e^{-j2\pi/N}$. We assume that in general, $\{x(n)\}$ is a complex-valued sequence. We also assume that the real and imaginary components of $\{x(n)\}$ and $\{W_N^{kn}\}$ are represented by $b$ bits. Consequently, the computation of the product $x(n)W_N^{kn}$ requires four real multiplications. Each real multiplication is rounded from $2b$ bits to $b$ bits, and hence there are four quantization errors for each complex-valued multiplication.
In the direct computation of the DFT, there are $N$ complex-valued multiplications for each point in the DFT. Therefore, the total number of real multiplications in the computation of a single point in the DFT is $4N$. Consequently, there are $4N$ quantization errors.
Let us evaluate the variance of the quantization errors in a fixed-point computation of the DFT. First, we make the following assumptions about the statistical properties of the quantization errors.
1. The quantization errors due to rounding are uniformly distributed random variables in the range $(-\Delta/2, \Delta/2)$ where $\Delta = 2^{-b}$.
2. The $4N$ quantization errors are mutually uncorrelated.
3. The $4N$ quantization errors are uncorrelated with the sequence $\{x(n)\}$.
Since each of the quantization errors has a variance
$$\sigma_e^2 = \frac{\Delta^2}{12} = \frac{2^{-2b}}{12} \tag{6.4.2}$$
the variance of the quantization errors from the $4N$ multiplications is
$$\sigma_q^2 = 4N\sigma_e^2 = \frac{2^{-2b}}{3}N \tag{6.4.3}$$
Hence the variance of the quantization error is proportional to the size of the DFT. Note that when $N$ is a power of 2 (i.e., $N = 2^\nu$), the variance can be expressed as
$$\sigma_q^2 = \frac{2^{-2(b - \nu/2)}}{3} \tag{6.4.4}$$
This expression implies that every fourfold increase in the size $N$ of the DFT requires an additional bit in computational precision to offset the additional quantization errors.
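The trade-off between DFT size and word length can be checked with a few lines of arithmetic. The sketch below simply evaluates (6.4.2) and (6.4.3); the function name `direct_dft_noise_variance` and the chosen sizes are hypothetical.

```python
def direct_dft_noise_variance(N, b):
    """Output noise variance 4*N*sigma_e^2 with sigma_e^2 = 2**(-2b)/12."""
    sigma_e2 = 2.0 ** (-2 * b) / 12.0      # variance of one rounding error, (6.4.2)
    return 4 * N * sigma_e2                # 4N uncorrelated errors, (6.4.3)

# Each fourfold increase in N is paired with one extra bit of precision.
variances = [direct_dft_noise_variance(N, b)
             for N, b in [(256, 12), (1024, 13), (4096, 14)]]
print(variances)
```

The three printed variances coincide, which is the statement that one additional bit offsets a fourfold increase in $N$.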
To prevent overflow, the input sequence to the DFT requires scaling. Clearly, an upper bound on $|X(k)|$ is
$$|X(k)| \le \sum_{n=0}^{N-1} |x(n)| \tag{6.4.5}$$
If the dynamic range in addition is $(-1, 1)$, then $|X(k)| < 1$ requires that
$$\sum_{n=0}^{N-1} |x(n)| < 1 \tag{6.4.6}$$
If $|x(n)|$ is initially scaled such that $|x(n)| < 1$ for all $n$, then each point in the sequence can be divided by $N$ to ensure that (6.4.6) is satisfied.
The scaling implied by (6.4.6) is extremely severe. For example, suppose that the signal sequence $\{x(n)\}$ is white and, after scaling, each value $|x(n)|$ of the sequence is uniformly distributed in the range $(-1/N, 1/N)$. Then the variance of the signal sequence is
$$\sigma_x^2 = \frac{(2/N)^2}{12} = \frac{1}{3N^2} \tag{6.4.7}$$
and the variance of the output DFT coefficients $|X(k)|$ is
$$\sigma_X^2 = N\sigma_x^2 = \frac{1}{3N} \tag{6.4.8}$$
Thus the signal-to-noise power ratio is
$$\frac{\sigma_X^2}{\sigma_q^2} = \frac{2^{2b}}{N^2} \tag{6.4.9}$$
We observe that the scaling is responsible for reducing the SNR by $N$ and the combination of scaling and quantization errors results in a total reduction that is proportional to $N^2$. Hence scaling the input sequence $\{x(n)\}$ to satisfy (6.4.6) imposes a severe penalty on the signal-to-noise ratio in the DFT.
Example 6.4.1
Use (6.4.9) to determine the number of bits required to compute the DFT of a 1024-point sequence with a SNR of 30 dB.
Solution The size of the sequence is $N = 2^{10}$. Hence the SNR is
$$10\log_{10}\frac{\sigma_X^2}{\sigma_q^2} = 10\log_{10} 2^{2b-20}$$
For an SNR of 30 dB, we have
$$3(2b - 20) = 30$$
$$b = 15 \text{ bits}$$
Note that the 15 bits is the precision for both multiplication and addition.
Instead of scaling the input sequence $\{x(n)\}$, suppose we simply require that $|x(n)| < 1$. Then we must provide a sufficiently large dynamic range for addition such that $|X(k)| < N$. In such a case, the variance of the sequence $\{|x(n)|\}$ is $\sigma_x^2 = \frac{1}{3}$, and hence the variance of $|X(k)|$ is
$$\sigma_X^2 = N\sigma_x^2 = \frac{N}{3} \tag{6.4.10}$$
Consequently, the SNR is
$$\frac{\sigma_X^2}{\sigma_q^2} = 2^{2b} \tag{6.4.11}$$
If we repeat the computation in Example 6.4.1, we find that the number of bits required to achieve a SNR of 30 dB is $b = 5$ bits. However, we need an additional 10 bits for the accumulator (the adder) to accommodate the increase in the dynamic range for addition. Although we did not achieve any reduction in the dynamic range for addition, we have managed to reduce the precision in multiplication from 15 bits to 5 bits, which is highly significant.
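The two word-length figures can be reproduced from the SNR expressions (6.4.9) and (6.4.11). The following sketch is illustrative only: the function names are hypothetical, and it uses the exact value of $10\log_{10}2$ in place of the approximation 3 dB per factor of 2, then rounds up to a whole number of bits.

```python
import math

DB_PER_FACTOR_OF_2 = 10 * math.log10(2)    # about 3.01 dB

def bits_scaled_input(N, snr_db):
    """Smallest b with 10*log10(2**(2b) / N**2) >= snr_db, from (6.4.9)."""
    return math.ceil((snr_db / DB_PER_FACTOR_OF_2 + 2 * math.log2(N)) / 2)

def bits_unscaled_input(snr_db):
    """Smallest b with 10*log10(2**(2b)) >= snr_db, from (6.4.11)."""
    return math.ceil(snr_db / DB_PER_FACTOR_OF_2 / 2)

N = 1024
b_scaled = bits_scaled_input(N, 30.0)      # input divided by N beforehand
b_unscaled = bits_unscaled_input(30.0)     # input only bounded by |x(n)| < 1
extra_accumulator_bits = int(math.log2(N)) # headroom so that |X(k)| < N fits
print(b_scaled, b_unscaled, extra_accumulator_bits)
```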
6.4.2 Quantization Errors in FFT Algorithms
As we have shown, the FFT algorithms require significantly fewer multiplications than the direct computation of the DFT. In view of this we might conclude that the computation of the DFT via an FFT algorithm will result in smaller quantization errors. Unfortunately, that is not the case, as we will demonstrate.
Let us consider the use of fixed-point arithmetic in the computation of a radix-2 FFT algorithm. To be specific, we select the radix-2, decimation-in-time algorithm illustrated in Fig. 6.20 for the case $N = 8$. The results on quantization errors that we obtain for this radix-2 FFT algorithm are typical of the results obtained with other radix-2 and higher radix algorithms.
We observe that each butterfly computation involves one complex-valued multiplication or, equivalently, four real multiplications. We ignore the fact that some butterflies contain a trivial multiplication by $\pm 1$. If we consider the butterflies that affect the computation of any one value of the DFT, we find that, in general, there are $N/2$ in the first stage of the FFT, $N/4$ in the second stage, $N/8$ in the third stage, and so on, until the last stage, where there is only one. Consequently, the number of butterflies per output point is
$$2^{\nu-1} + 2^{\nu-2} + \cdots + 2 + 1 = 2^{\nu-1}\left[1 + \frac{1}{2} + \cdots + \left(\frac{1}{2}\right)^{\nu-1}\right] = 2^{\nu}\left[1 - \left(\frac{1}{2}\right)^{\nu}\right] = N - 1 \tag{6.4.12}$$
For example, the butterflies that affect the computation of $X(3)$ in the eight-point FFT algorithm of Fig. 6.20 are illustrated in Fig. 6.21.
The quantization errors introduced in each butterfly propagate to the output. Note that the quantization errors introduced in the first stage propagate through $(\nu - 1)$ stages, those introduced in the second stage propagate through $(\nu - 2)$ stages, and so on. As these quantization errors propagate through a number of subsequent stages, they are phase shifted (phase rotated) by the phase factors $W_N^{kn}$. These phase rotations do not change the statistical properties of the quantization errors and, in particular, the variance of each quantization error remains invariant.
Figure 6.21 Butterflies that affect the computation of $X(3)$.
If we assume that the quantization errors in each butterfly are uncorrelated with the errors in other butterflies, then there are $4(N - 1)$ errors that affect the output of each point of the FFT. Consequently, the variance of the total quantization error at the output is
$$\sigma_q^2 = 4(N - 1)\frac{\Delta^2}{12} \approx \frac{N\Delta^2}{3} \tag{6.4.13}$$
where $\Delta = 2^{-b}$. Hence
$$\sigma_q^2 = \frac{N}{3} \cdot 2^{-2b} \tag{6.4.14}$$
This is exactly the same result that we obtained for the direct computation of the DFT.
The result in (6.4.14) should not be surprising. In fact, the FFT algorithm does not reduce the number of multiplications required to compute a single point of the DFT. It does, however, exploit the periodicities in $W_N^{kn}$ and thus reduces the number of multiplications in the computation of the entire block of $N$ points in the DFT.
As in the case of the direct computation of the DFT, we must scale the input sequence to prevent overflow. Recall that if $|x(n)| < 1/N$, $0 \le n \le N - 1$, then $|X(k)| < 1$ for $0 \le k \le N - 1$. Thus overflow is avoided. With this scaling, the relations in (6.4.7), (6.4.8), and (6.4.9), obtained previously for the direct computation of the DFT, apply to the FFT algorithm as well. Consequently, the same SNR is obtained for the FFT.
Since the FFT algorithm consists of a sequence of stages, where each stage contains butterflies that involve pairs of points, it is possible to devise a different scaling strategy that is not as severe as dividing each input point by $N$. This alternative scaling strategy is motivated by the observation that the intermediate values $|X_n(k)|$ in the $n = 1, 2, \ldots, \nu$ stages of the FFT algorithm satisfy the conditions (see Problem 6.35)
$$\max[|X_{n+1}(k)|, |X_{n+1}(l)|] \ge \max[|X_n(k)|, |X_n(l)|]$$
$$\max[|X_{n+1}(k)|, |X_{n+1}(l)|] \le 2\max[|X_n(k)|, |X_n(l)|] \tag{6.4.15}$$
In view of these relations, we can distribute the total scaling of $1/N$ into each of the stages of the FFT algorithm. In particular, if $|x(n)| < 1$, we apply a scale factor of $\frac{1}{2}$ in the first stage so that $|x(n)| < \frac{1}{2}$. Then the output of each subsequent stage in the FFT algorithm is scaled by $\frac{1}{2}$, so that after $\nu$ stages we have achieved an overall scale factor of $(\frac{1}{2})^\nu = 1/N$. Thus overflow in the computation of the DFT is avoided.
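This scaling procedure does not affect the signal level at the output of the FFT algorithm, but it significantly reduces the variance of the quantization errors at the output. Specifically, each factor of $\frac{1}{2}$ reduces the variance of a quantization error term by a factor of $\frac{1}{4}$. Thus the $4(N/2)$ quantization errors introduced in the first stage are reduced in variance by $(\frac{1}{4})^{\nu-1}$, the $4(N/4)$ quantization errors introduced in the second stage are reduced in variance by $(\frac{1}{4})^{\nu-2}$, and so on. Consequently, the total variance of the quantization errors at the output of the FFT algorithm is
$$\sigma_q^2 = \frac{\Delta^2}{12}\left\{4\left(\frac{N}{2}\right)\left(\frac{1}{4}\right)^{\nu-1} + 4\left(\frac{N}{4}\right)\left(\frac{1}{4}\right)^{\nu-2} + \cdots + 4\right\} = \frac{\Delta^2}{3}\left\{\left(\frac{1}{2}\right)^{\nu-1} + \left(\frac{1}{2}\right)^{\nu-2} + \cdots + \frac{1}{2} + 1\right\} = \frac{2\Delta^2}{3}\left[1 - \left(\frac{1}{2}\right)^{\nu}\right] \approx \frac{2}{3} \cdot 2^{-2b} \tag{6.4.16}$$
where the factor $(\frac{1}{2})^\nu$ is negligible.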
We now observe that (6.4.16) is no longer proportional to $N$. On the other hand, the signal has the variance $\sigma_X^2 = 1/3N$, as given in (6.4.8). Hence the SNR is
$$\frac{\sigma_X^2}{\sigma_q^2} = \frac{1}{2N} \cdot 2^{2b} = 2^{2b-\nu-1} \tag{6.4.17}$$
Thus, by distributing the scaling of $1/N$ uniformly throughout the FFT algorithm, we have achieved an SNR that is inversely proportional to $N$ instead of $N^2$.
Example 6.4.2
Determine the number of bits required to compute an FFT of 1024 points with an SNR of 30 dB when the scaling is distributed as described above.
Solution The size of the FFT is $N = 2^{10}$. Hence the SNR according to (6.4.17) is
$$10\log_{10} 2^{2b-\nu-1} = 30$$
$$3(2b - 11) = 30$$
$$b = \tfrac{21}{2} \quad (11 \text{ bits})$$
This can be compared with the 15 bits required if all the scaling is performed in the first stage of the FFT algorithm.
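The effect of distributing the scaling can be verified by summing the stage-by-stage error contributions numerically. The sketch below is an illustration under the assumptions stated in the text ($4(N/2^s)$ errors introduced in stage $s$, each attenuated in variance by $\frac{1}{4}$ per remaining stage); the function names are hypothetical and `v` stands for $\nu$.

```python
import math

def noise_variance_distributed(v, b):
    """Output noise variance of a 2**v-point FFT with a factor 1/2 applied per stage."""
    N = 2 ** v
    delta2 = 2.0 ** (-2 * b)                   # Delta^2, with Delta = 2**(-b)
    total = 0.0
    for stage in range(1, v + 1):
        n_errors = 4 * (N // 2 ** stage)       # errors affecting one output point
        attenuation = 0.25 ** (v - stage)      # variance reduction by later scalings
        total += n_errors * attenuation
    return total * delta2 / 12.0

def snr_db_distributed(v, b):
    """SNR in dB, using the signal variance 1/(3N) of (6.4.8)."""
    signal_variance = 1.0 / (3 * 2 ** v)
    return 10 * math.log10(signal_variance / noise_variance_distributed(v, b))

for b in (10, 11):
    print(b, round(snr_db_distributed(10, b), 2))
```

For $N = 1024$, 10 bits fall short of 30 dB while 11 bits exceed it, in line with the example.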
6.5 SUMMARY AND REFERENCES
The focus of this chapter was on the efficient computation of the DFT. We demonstrated that by taking advantage of the symmetry and periodicity properties of the exponential factors $W_N^{kn}$, we can reduce the number of complex multiplications needed to compute the DFT from $N^2$ to $N\log_2 N$ when $N$ is a power of 2. As we indicated, any sequence can be augmented with zeros, such that $N = 2^\nu$.
For decades, FFT-type algorithms were of interest to mathematicians who were concerned with computing values of Fourier series by hand. However, it was not until Cooley and Tukey (1965) published their well-known paper that the impact and significance of the efficient computation of the DFT was recognized. Since then the Cooley-Tukey FFT algorithm and its various forms, for example, the algorithms of Singleton (1967, 1969), have had a tremendous influence on the use of the DFT in convolution, correlation, and spectrum analysis. For a historical perspective on the FFT algorithm, the reader is referred to the paper by Cooley et al. (1967).
The split-radix FFT (SRFFT) algorithm described in Section 9.3.5 is due to Duhamel and Hollmann (1984, 1986). The “mirror” FFT (MFFT) and “phase” FFT (PFFT) algorithms were described to the authors by R. Price. The exploitation of symmetry properties in the data to reduce the computation time is described in a paper by Swarztrauber (1986).
Over the years, a number of tutorial papers have been published on FFT algorithms. We cite the early papers by Brigham and Morrow (1967), Cochran et al. (1967), Bergland (1969), and Cooley et al. (1967, 1969).
The recognition that the DFT can be arranged and computed as a linear convolution is also highly significant. Goertzel (1968) indicated that the DFT can be computed via linear filtering, although the computational savings of this approach is rather modest, as we have observed. More significant is the work of Bluestein (1970), who demonstrated that the computation of the DFT can be formulated as a chirp linear filtering operation. This work led to the development of the chirp-z transform algorithm by Rabiner et al. (1969).
In addition to the FFT algorithms described in this chapter, there are other efficient algorithms for computing the DFT, some of which further reduce the number of multiplications, but usually require more additions. Of particular importance is an algorithm due to Rader and Brenner (1976), the class of prime factor algorithms, such as the Good algorithm (1971), and the Winograd algorithm (1976, 1978). For a description of these and related algorithms, the reader may refer to the text by Blahut (1985).
PROBLEMS
6.1 Show that each of the numbers
$$e^{j(2\pi/N)k}, \qquad 0 \le k \le N - 1$$
corresponds to an $N$th root of unity. Plot these numbers as phasors in the complex plane and illustrate, by means of this figure, the orthogonality property
$$\sum_{n=0}^{N-1} e^{j(2\pi/N)kn}e^{-j(2\pi/N)ln} = \begin{cases} N, & \text{if } k = l \\ 0, & \text{if } k \ne l \end{cases}$$
6.2 (a) Show that the phase factors can be computed recursively by
$$W_N^{ql} = W_N^{q}W_N^{q(l-1)}$$
(b) Perform this computation once using single-precision floating-point arithmetic and once using only four significant digits. Note the deterioration due to the accumulation of round-off errors in the latter case.
(c) Show how the results in part (b) can be improved by resetting the result to the correct value $-j$, each time $ql = N/4$.
6.3 Let $x(n)$ be a real-valued $N$-point ($N = 2^\nu$) sequence. Develop a method to compute an $N$-point DFT $X'(k)$, which contains only the odd harmonics [i.e., $X'(k) = 0$ if $k$ is even] by using only a real $N/2$-point DFT.
6.4 A designer has available a number of eight-point FFT chips. Show explicitly how he should interconnect three such chips in order to compute a 24-point DFT.
6.5 The z-transform of the sequence $x(n) = u(n) - u(n - 7)$ is sampled at five points on the unit circle as follows
$$X(k) = X(z)\big|_{z = e^{j2\pi k/5}}, \qquad k = 0, 1, 2, 3, 4$$
Determine the inverse DFT $x'(n)$ of $X(k)$. Compare it with $x(n)$ and explain the results.
6.6 Consider a finite-duration sequence $x(n)$, $0 \le n \le 7$, with z-transform $X(z)$. We wish to compute $X(z)$ at the following set of values:
$$z_k = 0.8e^{j[(2\pi k/8) + (\pi/8)]}, \qquad 0 \le k \le 7$$
(a) Sketch the points $\{z_k\}$ in the complex plane.
(b) Determine a sequence $s(n)$ such that its DFT provides the desired samples of $X(z)$.