
Alpha Matting in Gimp

Jan Rüegg, Johann Wolf

June 24, 2011

1 Introduction

Gimp is a freely available open source program for image manipulation. Although it offers a lot of functionality, it still lacks a decent foreground selection tool. The alpha matting algorithm described by Eduardo S. L. Gastal and Manuel M. Oliveira [2] promised to be the right choice for such a tool, offering good performance and accuracy. We show that it has clear advantages over the current foreground extraction tool in Gimp, which uses an older method called Siox [1].

The algorithm is characterized by being very fast, with real-time calculation capabilities. This is achieved by restricting most of the computations to a localized area and by performing the computations in parallel on a GPU. Since Gimp only supports computations on a GPU in a special development branch, we decided to leave out these optimizations for the sake of providing the tool to a broader audience.

2 Algorithm

2.1 Working with Gimp

2.1.1 Data Layout

Before extending Gimp with a new tool, there are many details that have to be considered. For performance reasons Gimp has a characteristic way of storing images, namely in tiles. These tiles have dimensions of 64x64 pixels. Access to pixels through these tiles is much more efficient than through direct access. However, this layout has big implications for the ease of implementing an algorithm:

• If we want to access pixels in the neighborhood of a pixel, we either have to include a special case for the pixels on the border (since we access pixels that are outside of the tile) or we have to introduce a different data layout that is larger than the tiles themselves. This is necessary because only a limited number of tiles can be loaded at one time, and there is overhead in loading and releasing tiles.

• Another issue is that not all tiles have dimensions of 64x64, because the dimensions of images naturally don't have to be multiples of 64. This means that we have to take special cases into account when using tiles.

2.1.2 Debugging

Since Gimp is implemented in C, the code runs very efficiently and there are many ways to optimize it. However, working on such low-level code brings problems such as hard-to-find bugs and more complicated ways of solving problems. Additionally, we had to familiarize ourselves with the many libraries that are used, like Gtk, Glib, and the Gimp API itself, which has little to no documentation.

To be able to properly test and debug the code, we added many debug features, mostly in the form of defines that can be disabled. For example, we wrote routines to save the content of the tile cache to a .ppm file, or to load and save the user selection from/to a file, so that during testing one doesn't have to draw the selection again and again. It is also possible to get not only the final image as a result, but also the estimated foreground, background or alpha value alone. Finally, we are able to stop the algorithm after each of the four steps to see the intermediate results.

Additionally, we used various tools (like memory leak checkers and profilers) to improve the speed and correctness of our implementation.

2.2 Alpha Matting

As a first step we generate a trimap from the input the user has provided, which specifies foreground and background pixels. We take these pixels as ground truth, so it is important that these markings are made cautiously. Starting from this basis, each unmarked pixel looks for the nearest known neighbor whose color distance is sufficiently small, and we assign the pixel to the foreground or background of that neighbor. If no such neighbor can be found within range, we mark the pixel as unknown, ending up with a trimap of foreground, background and unknown pixels.

In the next step we go through the unknown pixels left by the first step. For each pixel we look in four directions, searching in each direction for the closest foreground and background pixel. These directions differ between the pixels in a neighborhood, so that when we compare the retrieved values later on we will have implicitly searched in many directions for each pixel.

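The directional search described above can be illustrated with a small sketch. This is not the Gimp implementation (which is written in C); Python is used only for brevity. The function names, the trimap labels "F", "B" and "U", and in particular the way the starting angle is rotated inside a 3x3 window are assumptions made for this example.

```python
import math


def ray_directions(x, y, rays=4, window=3):
    """Return `rays` unit vectors for pixel (x, y).

    The starting angle depends on the pixel's position inside a
    window x window block (an assumed scheme), so that neighboring
    pixels search along different directions.
    """
    step = 2.0 * math.pi / rays
    slot = (x % window) + window * (y % window)
    offset = slot * step / (window * window)
    return [(math.cos(offset + k * step), math.sin(offset + k * step))
            for k in range(rays)]


def closest_known_along_ray(trimap, x, y, direction, label, max_dist):
    """Walk from (x, y) along `direction` and return the first pixel whose
    trimap entry equals `label`. Returns None if no such pixel is found
    within `max_dist` steps or if the ray leaves the image."""
    dx, dy = direction
    height, width = len(trimap), len(trimap[0])
    for t in range(1, max_dist + 1):
        px = int(round(x + t * dx))
        py = int(round(y + t * dy))
        if not (0 <= px < width and 0 <= py < height):
            return None
        if trimap[py][px] == label:
            return (px, py)
    return None


# Usage on a one-row trimap: B = background, U = unknown, F = foreground.
trimap = [list("BUUUF")]
nearest_fg = closest_known_along_ray(trimap, 1, 0, (1.0, 0.0), "F", 10)
nearest_bg = closest_known_along_ray(trimap, 1, 0, (-1.0, 0.0), "B", 10)
```

With four rays per pixel and nine different starting angles in a 3x3 window, comparing the samples of neighboring pixels effectively covers 36 directions, which is the idea behind searching "implicitly" in many directions.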
Once we have these values (possibly four each for foreground and background), we find the best foreground/background pair to model our pixel color. This is done with an objective function that takes into account the distance to the found pixels, the fluctuations of the color gradient along the path to these pixels, and the chromatic distortion around the pixel in question.

After we've found a good combination of foreground/background colors for each unknown pixel, we go through the entries again. This time we search for the three best color combinations for a pixel in its neighbourhood. Depending on the variance, we either take an average of these three values for the foreground and/or background, or stay with the original value as an estimate.

Following up on this, we obtain the final background/foreground colors by performing a Gaussian smoothing, weighted relative to the alpha value and the confidence of the colors. We then calculate the final confidence and use this result to calculate the final alpha value.

A few specific details of the algorithm as described in the paper were not entirely clear to us. Therefore, we contacted the authors of the paper. Eduardo Gastal was kind enough to provide the clarifications we needed, so that we were able to implement the algorithm in full detail.

2.3 Implementation in Gimp

With a few minor differences mentioned below, we fully implemented all three steps (Expansion of Known Regions, Sample Selection and Matte Computation, and Local Smoothing) explained in the paper. The implementation of the core algorithm is about 1600 lines of code, entirely written by us. Additionally, various GUI changes had to be made in many different files, which probably also add up to a few hundred lines of code.

As previously stated, Gimp uses tiles of size 64x64 to store images. This complicates working in an area around a pixel (as the algorithm requires), since in many cases we need to access pixels outside of the current tile. To simplify this and to reduce frequent loading/releasing of tiles, we introduced a cache bigger than the actual tiles. Depending on the current step, we load either 1x1 or 3x3 tiles into the cache, work on the cache and save the results back.

To keep track of the unknown pixels, which are the only ones we have to process after the first step, we introduced a hash table that uses the x/y image coordinates as the hash key and chains the entries together in a linked list. This speeds up the algorithm greatly by avoiding unnecessary access to already known pixels, and thus prevents many tiles from being loaded into the cache. It also reduces the memory needed to process the pixels, compared to having whole image layers for storing the intermediate results. However, frequently retrieving hash entries proved to be inefficient. Using a profiler we discovered that this accounted for up to 80% of the computation time!

To solve this problem we introduced a second, much smaller cache for the hash entries that is used during processing. After the cache has been processed, the data is written back to the hash table.

One optimisation we made relative to the original algorithm concerns accessing the neighbors of a pixel. The method described in the paper always searches and compares other pixels within a certain radius. Instead, we often did the comparisons in a square around the original pixel, which makes the access pattern simpler and uses the caches better. The only downside is that, in certain cases, a neighbor in the corner of the square might be selected instead of a spatially closer neighbor along the edge.

Another difference from the paper is the range of the search for foreground/background pixels in the four directions. In the paper the step size is 6 pixels with a maximum of 300 steps, amounting to a maximal distance of 1800 pixels for a search in one direction. Since this would be computationally very expensive and would require a lot of memory with our caching scheme, we restricted this search to a maximal distance of 64*3 = 192 pixels. In the future, this could be improved with a more intelligent access scheme for pixels outside of the cache, which would give more accurate results for big images.

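The search ranges quoted above can be checked with a few lines of arithmetic. The constants come from the text; the helper names are made up for this sketch, and whether the step size of 6 pixels was kept in the restricted search is an assumption.

```python
TILE_SIZE = 64          # edge length of a Gimp tile in pixels (from the text)
PAPER_STEP = 6          # step size in pixels used in the paper
PAPER_MAX_STEPS = 300   # maximum number of steps used in the paper


def max_distance(step, max_steps):
    """Farthest distance in pixels reachable along one search direction."""
    return step * max_steps


def steps_within(limit, step):
    """Number of whole steps of size `step` that fit into `limit` pixels."""
    return limit // step


paper_range = max_distance(PAPER_STEP, PAPER_MAX_STEPS)
restricted_range = 3 * TILE_SIZE
# Assumes the paper's step size of 6 pixels is kept in the restricted search.
restricted_steps = steps_within(restricted_range, PAPER_STEP)

print(paper_range, restricted_range, restricted_steps)  # prints "1800 192 32"
```

Under that assumption the restricted search visits at most 32 samples per direction instead of 300, and 192 pixels is exactly the edge length of three tiles.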
3 Results

3.1 Images

In Figure 1, one can see that the method gives quite good results, although some artefacts from bright background spots remain. The extraction was performed on a laptop with a touchpad and took about 2 minutes for the 800x533 image, including all user interaction and multiple runs of the algorithm. Running the algorithm once took about 5 seconds, with 25% of the image's pixels being unknown.

Figure 1: Input and result from a simple extraction

The comparison in Figure 2 clearly shows the advantages over the foreground selection tool shipped with Gimp. That tool's algorithm, Siox [1], doesn't support semitransparency in extractions: the final alpha mask is only binary. As a result, background color can leak into border pixels. Also, details like fine hair cannot be modelled well. This greatly limits the usefulness of such a tool, since such an extraction can also be done quite well manually.

Figure 2: Input image, extracted foreground from the original tool and from our implementation, and alpha mask from the original tool and from our implementation

Figure 3: Parts of the extracted foreground, alpha value, and estimated foreground and background colors after each step of the algorithm (expansion, search, comparison, smoothing)

Figure 3 shows how every step of the algorithm improves the result. We show, for every sub-step, the extracted foreground in front of a black background, the alpha mask and the estimated background and
foreground colors. For the latter three images, the white regions to the left represent the foreground marked by the user, and those on the right the background. Combining the estimated foreground and background images according to the alpha mask should give the corresponding section of the original image.

3.2 GUI

Besides the algorithm itself, we implemented some GUI elements that give the user more interactive control over the tool, as can be seen in Figure 4. In addition to the brushing interface and the mask preview, which we partly reused from the old implementation, we added a percentage bar defining when to start the algorithm. As there should not be too many unknown pixels in a trimap-based approach, we do not waste time computing on initial rough strokes. We also added the possibility to entirely disable the calculations for refining strokes, and to set regions back to unknown after accidentally marking them wrong.

Figure 4: Implemented GUI elements

4 Conclusion & Outlook

The results are very promising and seem to us to be of good quality. The time for the computations is reasonable, and the ability to tweak the results after a run is very convenient. Compared to the built-in foreground selection tool in Gimp the difference is immense. However, there are still a few ways in which we can improve our tool:

First of all, a more sophisticated (incremental) way of altering the trimap after a run of the algorithm could be provided. Currently, after the trimap is refined through user input, the whole algorithm is run again on all pixels. Instead, only the pixels in the region of the added strokes could be processed again.

Our painting interface could also be improved further. In contrast to our simple approach, a more sophisticated one could be implemented, for example filling holes, or even a flooding approach using detection of edges in the image, as mentioned in the paper. This would simplify the user input and might also improve the quality of the result.

In terms of speed, there should be several ways to improve our code by using more efficient approaches. Furthermore, we don't use parallelisation at all. However, multiple threads are supported in Gimp, and it should be quite straightforward to make use of them for the algorithm.

Furthermore, we restricted our algorithm to certain types of images. Gimp supports many different colorspaces, indexed images, and images with and without an alpha channel. Since our algorithm works on the raw image data, it relies heavily on the color layout, and currently assumes regular r/g/b data. However, it should be easy to extend it to other types of images by doing a simple conversion to r/g/b in between.

We're planning to get these changes included in the official Gimp codebase, and will contact the Gimp developers about it in the near future. During our work we frequently pulled the upstream changes to ensure compatibility with the original code. We're hoping that our tool will provoke many positive reactions and that it will be included in a future Gimp release.

Code

The original code repository can be found under
http://git.gnome.org/browse/gimp/

Our fork, including the changes, is hosted on GitHub:
http://github.com/rggjan/Gimp-Matting

References

[1] G. Friedland, K. Jantz, L. Knipping, R. Rojas. Image segmentation by uniform color clustering. 2005.

[2] Eduardo S. L. Gastal, Manuel M. Oliveira. Shared sampling for real-time alpha matting. 2010.
