Gimp Alpha Matting
Gimp is freely available open-source software for image manipulation. Despite its extensive functionality, it still lacks a decent foreground-selection tool. The alpha-matting algorithm described by Eduardo S. L. Gastal and Manuel M. Oliveira [2] promised to be the right choice for such a tool, offering good performance and accuracy. We show that it has clear advantages over the current foreground-extraction tool in Gimp, which uses an older method called Siox [1].

The algorithm is characterized by being very fast, with real-time calculation capabilities. This is achieved by restricting most of the computations to a localized area and, furthermore, by being able to run the computations in parallel on a GPU. Since Gimp only supports GPU computations in a special development branch, we decided to leave out these optimizations in order to make the tool available to a broader audience.

2 Algorithm

2.1 Working with Gimp

2.1.1 Data Layout

Since Gimp is implemented in C, the code runs very efficiently and there are many ways to optimize it. However, working on such low-level code brings problems such as hard-to-find bugs and more complicated ways of solving them. Additionally, we had to familiarize ourselves with the many different libraries that are used, like Gtk, Glib, and the Gimp API itself, which has little to no documentation.

To be able to properly test and debug the code, we added many debug features, mostly in the form of defines that can be disabled. For example, we wrote routines to save the content of the tile cache to a .ppm file, and to load and save the user selection from/to a file. This means that, during testing, one does not have to draw the selection again and again. It is also possible to obtain not only the final image as a result, but also the estimated foreground, background, or alpha values alone. Finally, we are able to stop the algorithm after each of the four steps to inspect the intermediate results.

Additionally, we used various tools (like memory leak checkers and profilers) to improve the speed and correctness of our implementation.
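The following sketch illustrates the kind of compile-time debug switch we mean; the define and function names (MATTING_DEBUG, debug_dump_ppm) are placeholders, not the actual identifiers from our code:

#include <stdio.h>

/* Compile-time switch: set to 0 to disable all debug dumps. */
#define MATTING_DEBUG 1

/* Write an 8-bit RGB buffer (width * height * 3 bytes) as a binary PPM file. */
static void
debug_dump_ppm (const char *filename, const unsigned char *rgb,
                int width, int height)
{
#if MATTING_DEBUG
  FILE *f = fopen (filename, "wb");

  if (f == NULL)
    return;
  fprintf (f, "P6\n%d %d\n255\n", width, height);
  fwrite (rgb, 3, (size_t) width * height, f);
  fclose (f);
#else
  (void) filename; (void) rgb; (void) width; (void) height;
#endif
}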
Once we have got the values from the directional search - possibly four values each for the foreground and the background - we find the best foreground/background pair to model our pixel color. This is done with an objective function that takes into account the distance of the found pixels, the fluctuations of the color gradient along the way to these pixels, and the chromatic distortion around the pixel in question.
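In simplified form, such a selection criterion can be sketched as follows; the weights and the exact combination of terms are illustrative, the real formulation follows the paper:

#include <math.h>

typedef struct { double r, g, b; } Color;

/* Squared euclidean distance between two colors. */
static double
color_sqdist (Color a, Color b)
{
  double dr = a.r - b.r, dg = a.g - b.g, db = a.b - b.b;
  return dr * dr + dg * dg + db * db;
}

/* Alpha that best explains pixel color c as a blend of f and b
 * (least-squares projection of c onto the line between f and b). */
static double
estimate_alpha (Color c, Color f, Color b)
{
  double denom = color_sqdist (f, b);
  double num;

  if (denom < 1e-12)
    return 0.0;
  num = (c.r - b.r) * (f.r - b.r)
      + (c.g - b.g) * (f.g - b.g)
      + (c.b - b.b) * (f.b - b.b);
  num /= denom;
  return num < 0.0 ? 0.0 : (num > 1.0 ? 1.0 : num);
}

/* Lower cost = better foreground/background pair for pixel color c.
 * dist_f/dist_b are the spatial distances to the found samples and
 * energy_f/energy_b the accumulated color-gradient fluctuations along
 * the search paths; w1..w3 are illustrative weights. */
static double
pair_cost (Color c, Color f, Color b,
           double dist_f, double dist_b,
           double energy_f, double energy_b)
{
  const double w1 = 1.0, w2 = 0.5, w3 = 0.5;
  double alpha = estimate_alpha (c, f, b);
  Color  blend = { alpha * f.r + (1.0 - alpha) * b.r,
                   alpha * f.g + (1.0 - alpha) * b.g,
                   alpha * f.b + (1.0 - alpha) * b.b };

  return w1 * sqrt (color_sqdist (c, blend))   /* chromatic distortion */
       + w2 * (dist_f + dist_b)                /* sample distance      */
       + w3 * (energy_f + energy_b);           /* gradient fluctuation */
}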
After we have found a good foreground/background combination for each unknown pixel, we go through the entries again. This time we search for the three best color combinations for a pixel in its neighbourhood. Depending on their variance, we either take the average of these three values for the foreground and/or the background, or we keep the original value as the estimate.
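A minimal sketch of this variance test for the foreground colors, with an illustrative threshold (the actual decision rule is the one from the paper):

typedef struct { double r, g, b; } Color;

/* Average the three best neighbourhood candidates if they agree
 * (low variance), otherwise keep the original estimate.
 * The threshold is an illustrative parameter. */
static Color
refine_estimate (Color original, const Color cand[3], double threshold)
{
  Color  mean = { (cand[0].r + cand[1].r + cand[2].r) / 3.0,
                  (cand[0].g + cand[1].g + cand[2].g) / 3.0,
                  (cand[0].b + cand[1].b + cand[2].b) / 3.0 };
  double var  = 0.0;
  int    i;

  for (i = 0; i < 3; i++)
    {
      double dr = cand[i].r - mean.r;
      double dg = cand[i].g - mean.g;
      double db = cand[i].b - mean.b;
      var += (dr * dr + dg * dg + db * db) / 3.0;
    }

  return (var < threshold) ? mean : original;
}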
Following up on this, we retrieve the final background and foreground colors by applying a Gaussian weighting that is relative to the alpha value and the confidence of the colors. We then calculate the final confidence and use this result to calculate the final alpha value.
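Roughly, such a smoothing step can be pictured as the weighted average below; the window size, sigma, and the exact form of the weight are illustrative assumptions, not the values from the paper:

#include <math.h>

/* Gaussian-weighted smoothing of one channel of the foreground estimate
 * at (x, y). fg, alpha and confidence are width*height arrays; radius and
 * sigma are illustrative parameters. */
static double
smooth_foreground (const double *fg, const double *alpha,
                   const double *confidence,
                   int width, int height,
                   int x, int y, int radius, double sigma)
{
  double sum = 0.0, wsum = 0.0;
  int    dx, dy;

  for (dy = -radius; dy <= radius; dy++)
    for (dx = -radius; dx <= radius; dx++)
      {
        int    nx = x + dx, ny = y + dy;
        int    idx;
        double spatial, w;

        if (nx < 0 || ny < 0 || nx >= width || ny >= height)
          continue;
        idx     = ny * width + nx;
        spatial = exp (-(dx * dx + dy * dy) / (2.0 * sigma * sigma));
        w       = spatial * alpha[idx] * confidence[idx];
        sum    += w * fg[idx];
        wsum   += w;
      }

  return (wsum > 0.0) ? sum / wsum : fg[y * width + x];
}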
A few specific details of the algorithm as described in the paper were not entirely clear to us. We therefore contacted the authors. Eduardo Gastal was kind enough to provide the clarifications we needed, so that we were able to implement the algorithm in all its details.
2.3 Implementation in Gimp
With a few minor differences mentioned below, we fully implemented all three steps (Expansion of Known Regions, Sample Selection and Matte Computation, and Local Smoothing) explained in the paper. The implementation of the core algorithm is about 1600 lines of code, entirely written by us. Additionally, various GUI changes had to be made in many different files, which probably add up to a few hundred lines of code.
To store intermediate per-pixel results, we used the pixel coordinates in the images as a hash key and chained the entries together in a linked list. This speeds up the algorithm greatly by avoiding unnecessary accesses to already known pixels and thus prevents many tiles from being loaded into the cache. It also reduces the memory needed to process the pixels, compared to storing the intermediate results in whole image layers. However, frequently retrieving hash entries proved to be inefficient: using a profiler, we discovered that this accounted for up to 80% of the computation time!
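In outline, the data structure looks like the sketch below; the field names and the table size are simplified stand-ins for the actual code:

#include <stdlib.h>

#define BUCKET_COUNT 4096          /* illustrative table size */

/* One intermediate per-pixel result, chained per bucket. */
typedef struct PixelEntry
{
  int                x, y;         /* pixel coordinates = hash key */
  float              fg[3], bg[3]; /* current color estimates      */
  float              alpha;
  struct PixelEntry *next;         /* collision chain              */
} PixelEntry;

static unsigned int
hash_coords (int x, int y, int width)
{
  return (unsigned int) (y * width + x) % BUCKET_COUNT;
}

static PixelEntry *
entry_lookup (PixelEntry **table, int x, int y, int width)
{
  PixelEntry *e = table[hash_coords (x, y, width)];

  while (e != NULL && (e->x != x || e->y != y))
    e = e->next;
  return e;
}

static PixelEntry *
entry_insert (PixelEntry **table, int x, int y, int width)
{
  unsigned int h = hash_coords (x, y, width);
  PixelEntry  *e = calloc (1, sizeof (PixelEntry));

  e->x = x;
  e->y = y;
  e->next  = table[h];             /* prepend to the bucket's chain */
  table[h] = e;
  return e;
}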
To solve this problem we introduced a second, much smaller cache for the hash entries that is used during processing. After the cache has been processed, the data is written back to the hash table.
An optimisation we made, in contrast to the original algorithm, concerns the way neighbors of a pixel are accessed. The method described in the paper is to always search and compare other pixels within a certain radius. Instead, we often did the comparisons in a square around the original pixel, which makes the access pattern simpler and uses the caches more effectively. The only downside is that, in certain cases, a neighbor in a corner of the square might be selected instead of a spatially closer neighbor along the edge.
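The difference between the two access patterns, sketched with a hypothetical visit callback:

/* Radius-based access as described in the paper: only pixels within
 * euclidean distance r of (x, y) are visited. */
static void
visit_disc (int x, int y, int r, void (*visit) (int px, int py))
{
  int dx, dy;

  for (dy = -r; dy <= r; dy++)
    for (dx = -r; dx <= r; dx++)
      if (dx * dx + dy * dy <= r * r)
        visit (x + dx, y + dy);
}

/* Our variant: visit the full square. This walks memory row by row and
 * plays nicer with the tile cache, but may pick a corner pixel that lies
 * slightly outside the radius. */
static void
visit_square (int x, int y, int r, void (*visit) (int px, int py))
{
  int dx, dy;

  for (dy = -r; dy <= r; dy++)
    for (dx = -r; dx <= r; dx++)
      visit (x + dx, y + dy);
}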
Another difference to the paper is the range of the search for foreground/background pixels in the four directions. In the paper the step size is 6 pixels with a maximum of 300 steps, amounting to a maximal distance of 1800 pixels for a search in one direction. Since this would be very computationally expensive and would impose a large memory demand with our caching scheme, we restricted the search to a maximal distance of 64*3 = 192 pixels. In the future, this could be improved with a more intelligent access scheme for pixels outside of the cache, giving more accurate results for big images.
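One such directional search with the reduced range can be sketched as follows; is_known() and the out-parameters are placeholders for the real code:

#define STEP_SIZE    6             /* step size from the paper            */
#define MAX_DISTANCE (64 * 3)      /* our cap: 192 pixels instead of 1800 */

/* Walk from (x, y) in direction (dir_x, dir_y), e.g. (1, 0) for "right",
 * and report the first known pixel found within the capped range.
 * is_known() stands in for the real test of the user selection. */
static int
find_sample (int x, int y, int dir_x, int dir_y,
             int width, int height,
             int (*is_known) (int px, int py),
             int *found_x, int *found_y)
{
  int dist;

  for (dist = STEP_SIZE; dist <= MAX_DISTANCE; dist += STEP_SIZE)
    {
      int px = x + dir_x * dist;
      int py = y + dir_y * dist;

      if (px < 0 || py < 0 || px >= width || py >= height)
        return 0;                  /* left the image             */
      if (is_known (px, py))
        {
          *found_x = px;
          *found_y = py;
          return 1;                /* found a known F or B pixel */
        }
    }

  return 0;                        /* nothing within the capped range */
}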
3 Results

3.1 Images

Figure 1: Input and result from a simple extraction

In Figure 1, one can see that the method gives quite good results, although some artefacts from bright background spots remain. The extraction was performed on a laptop with a touchpad and took about 2 minutes for an image of size 800x533, including all user interaction and multiple runs of the algorithm. Running the algorithm once took about 5 seconds, with 25% of the image pixels marked as unknown.
Figure 3: Parts of the extracted foreground, the alpha values, and the estimated foreground and background colors after each step of the algorithm (expansion, search, comparison, smoothing)
More improvements could also be made to our painting interface. In contrast to our simple approach, a more sophisticated method could be implemented, for example filling holes, or even a flood-fill approach that uses edge detection in the image, as mentioned in the paper. This would simplify the user input and might also improve the quality of the result.