# Discrete cosine transform algorithms for FPGA devices

[PDF] [PS] [PS.BZ2] [VIEW]

**Bibtex:**@mastersthesis{babic03,
author = {Domagoj Babi\'c},
title = {{Discrete cosine transform algorithms for FPGA devices}},
school = {Faculty of Electrical Engineering and Computing of Zagreb
University},
year = {2003},
}

**Summary:**
Polynomial transform was presented as an efficient way to map multidimensional
transforms (like DCT and DFT) and convolutions to onedimensional
ones. Resulting algorithms have significantly lower computational
complexity, especially for convolutions and DCT. If optimal one-dimensional
DCT would be available, it would be possible to obtain optimal
multidimensional algorithms by using polynomial transforms.
As demonstrated in the thesis, polynomial transform based DCT can be
implemented in significantly less logic resources than row-column distributed
arithmetic algorithm. Polynomial transform computation is performed as the
series of additions and subtractions, so the finite-register length effects can
be avoided resulting in higher accuracy.
Second dimension of distributed arithmetic DCT implementation was
more problematic due to larger input word length. The resulting longer
carry chain, especially the one in the large accumulators at the output, was
the major obstacle to achieving higher operating frequency. The implementation
of polynomial transform DCT avoids this obstacle by using a butterfly
of adders and no accumulators are needed at the output.
The accuracy of both implementations has been extensively measured and
detailed results were reported, more detailed then available in the previous
work. Once the strong foundations for comparison were built, other factors,
like resource usage, maximum achievable frequency, latency and throughput
have been compared.
Another contribution of the thesis is the proposal of designing the butterfly
stage of such highly complex algorithms by scheduling the computation
from the last stage, where the designer has no flexibility. This approach has
yielded a very efficient FPGA implementation, superior to others reported.