A Modern C++ Point of View of Programming in Image Processing - 2022
Michaël Roynard, Edwin Carlinet, Thierry Géraud
Edwin Carlinet
[email protected]
Thierry Géraud
[email protected]
Abstract
C++ is a multi-paradigm language that enables the programmer to set up efficient image processing algorithms easily. The language's strength comes from many aspects. C++ is high-level, which enables developing powerful abstractions and mixing different programming styles to ease development. At the same time, C++ is low-level and can take full advantage of the hardware to deliver the best performance. It is also very portable and highly compatible, which allows algorithms to be called from high-level, fast-prototyping languages such as Python or Matlab. One of the most fundamental aspects where C++ really shines is generic programming. Generic programming makes it possible to develop and reuse bricks of software on objects (images) of different natures (types) without performance loss. Nevertheless, reconciling genericity, efficiency, and simplicity at the same time is not trivial. Modern C++ (post-2011) has brought new features that made the language simpler and more powerful. In this paper, we focus in particular on some C++20 aspects of generic programming: ranges, views, and concepts, and see how they extend to images to ease the development of generic image algorithms while lowering the computation time.

Keywords: Image processing, Generic Programming, Modern C++, Software, Performance

Figure 1: The watershed segmentation algorithm runs on a 2D-regular grayscale image (left), on a vertex-valued graph (middle) and on a 3D mesh (right).

1 Introduction

C++ claims to “leave no room for a lower-level language (except assembler)” [38], which makes it a go-to language when developing high-performance computing (HPC) image processing applications. The language is designed after a zero-overhead abstraction principle that allows us to devise high-level but efficient solutions to image processing problems. Other aspects of C++ are its stability, its portability on a wide range of architectures, and its direct interface with the C language, which makes C++ easily interoperable with high-level prototyping languages. This is why the performance-sensitive features of many image processing libraries (and numerical libraries in general) are implemented in C++ (or C/Fortran as in OpenCV [7], IPP [11]) or with a hardware-dedicated language (e.g. CUDA [8]) and are exposed through a high-level API to Python, Lua, etc.

Apart from the performance considerations, the problem lies in that each image processing field comes with its own set of image types to process. Obviously, the most common image type is an image of RGB or gray-level values, encoded with 8 bits per channel, on a regular 2D rectangular domain; it covers 90% of common usages. However, with the development of new devices have come new image types: 3D multi-band images in Medical Imaging, hyperspectral images in Astronomical Imaging, images with complex values in Signal Processing, etc. Some devices generate images with a depth channel which is encoded with a number of bits different from the other channels. An image processing library able to handle those image types would cover 99% of use cases. Finally, the remaining 1% would cover the usage of esoteric image types.

In Digital Topology, we have to deal with non-regular domains where pixels are not regular pixels. They might be super-pixels produced by a segmentation algorithm, hexagonal pixels, pixels defined on some special grids (e.g. the cairo pattern [19]) or even the vertices of a mesh. In Mathematical Morphology, most image operators are defined on a graph
framework and are naturally extended to a hierarchical representation of the image (e.g. operators on hierarchies of segmentation [28], trees [12] or a shape space [44]). The fact that image processing is related to many fields has already led Järvi et al. to wonder how types can easily be adapted to fit different image formalisms [22].

    void dilate_rect(image2d_u8 in, image2d_u8 out, int w, int h) {
      for (int y = 0; y < out.height(); ++y)
        for (int x = 0; x < out.width(); ++x) {
          uint8_t s = 0;
          // Scan the w-by-h window centered on (x, y).
          for (int qy = y - h/2; qy <= y + h/2; ++qy)
            for (int qx = x - w/2; qx <= x + w/2; ++qx)
              // Ignore the neighbors that fall outside the image.
              if (0 <= qy && qy < in.height() && 0 <= qx && qx < in.width())
                s = max(s, in(qx, qy));
          out(x, y) = s;
        }
    }

Figure 2: Non-generic dilation algorithm for 8-bit grayscale 2D images and a rectangular SE.

[Figure 3 axes: SE (square, diamond, ball); structure (2D-buffer, 3D-buffer, graph); values (8-bits RGB, 16-bits int). Highlighted region: possible uses of the dilation with a square SE.]

Figure 3: The combinatorial set of inputs that a dilation operator may handle.

From a programming standpoint, the ability to run the same algorithm (code) over a different set of image types, as shown in fig. 1, is called genericity. This term was defined by Musser in [30] as follows: “By generic programming we mean the definition of algorithms and data structures at an abstract or generic level, thereby accomplishing many related programming tasks simultaneously. The central notion is that of generic algorithms, which are parametrized procedural schemata that are completely independent of the underlying data representation and are derived from concrete, efficient algorithms.” To illustrate our point, we will consider a simple yet complex enough image operation: the dilation of an image f by a flat structuring element (SE) B defined as

$$ g(x) = \bigvee_{y \in B_x} f(y) \tag{1} $$

Simply said, it consists in taking the supremum of the values of f in the region B centered in x. Despite the apparent simplicity, this operator allows a high variability of the inputs. f can be a regular 2D image as well as a graph; values can be grayscale as well as colors; the SE can be a rectangle as well as a disc adaptive to the local content. The straightforward implementation in fig. 2 covers only one possible set of parameters: the dilation of 8-bit grayscale 2D images by a rectangle. The combinatorial set of parameters increases drastically with the types of the inputs, as seen in fig. 3. In [34], the authors depict four different approaches to leverage “genericity” in order to write a generic version of an algorithm.

With Ad-hoc polymorphism (A), one has to write one implementation for each image type, which involves code duplication to be exhaustive. The ability to select which implementation will run is based on the “real” type of the image. In C++, if this information is known at compile time (static), the compiler selects the right implementation by itself (static dispatch by overload resolution). If the “real” type of the image is known only dynamically, one has to select the correct implementation by hand by writing boilerplate code.

With Generalization (B), one has to consider a common type for all images (let us name it the super-type) and write algorithms for this common type. It implies conversions back and forth between the super-type and the other image types for every computation.

With Inclusion Polymorphism, Dynamic Traits (C), one has to define an abstract type featuring all common image operations. For example, one may consider that all images must define an operator get_value(Point p) -> Any where Point is a type able to contain any point value (2d, 3d, graph vertex, etc.) and Any a type able to hold any value. This is generally achieved using inclusion polymorphism in Object-Oriented Programming with an interface and/or abstract type AbstractImage for all image types. It may also be achieved using more modern techniques such as type-erasure with a type AnyImage (that has the same interface as AbstractImage) to which any image can be converted. Whatever the technique used behind the scenes, it relies on a dynamic dispatch at runtime to resolve which interface method is called.

Parametric Polymorphism, Generics, Static

    template <typename T>
    concept MaxMonoid =
      requires(T x) {
        T(0);            // neutral element "0"
        x = max(x, x);   // binary operation "max"
      };

    template <Range R>
      requires MaxMonoid<value_t<R>>
    auto maxof(R col) {
      value_t<R> s = 0;
      for (auto e : col)
        s = max(s, e);
      return s;
    }
“max” and a neutral element “0”. Actually (1) is abstracted by pairs of iterators in the STL and ranges in C++20, while C++20 introduces concepts to check if a type follows the requirements of

    template <class I, class SE>
    void dilation(I input, I output, SE se) {
      for (auto p : output.domain()) {
        value_t<I> s = min_of_v<value_t<I>>;
        for (auto q : se(p))
          s = max(s, input(q));
        output(p) = s;
      }
    }

Figure 5: Generic dilation algorithm.

    template <class I>
    concept Image = requires {
      typename point_t<I>;       // Type of point (P)
      typename value_t<I>;       // Type of value (V)
    } && requires (I f, point_t<I> p, value_t<I> v) {
      { v = f(p) };              // read access
      { f(p) = v };              // optional, for output
      { f.domain() } -> Range;   // (actually Range of P)
    };

    template <class SE, class P>
    concept StructuringElement =
      requires (SE se, P p) {
        { se(p) } -> Range;      // (actually Range of P)
      };

    template <Image I, class SE>
    void dilation(I input, I output, SE se)
      requires MaxMonoid<value_t<I>>
            && StructuringElement<SE, point_t<I>>
    { ... }
    template <Image I, class SE> // (1)
      requires MaxMonoid<value_t<I>> &&
               StructuringElement<SE, point_t<I>>
    void dilation(I input, I output, SE se)
    { /* Generic impl. */ }

by the authors in [34]. We performed the image processing concept extraction and made it available alongside the image processing library Pylene [13].
[Figure: examples of image views. Recoverable expressions: auto h = [](int x) { ... } mapping x to 0 if x < 150 and to 255 if x ≥ 150; auto u = clip(image, DiamondShape ROI); auto v = transform(u, h); filter(image, [](int x) { return (x % 2) == 0; }).]
important feature in a pipeline design (generally, in software engineering) is object composition. It enables composing simple blocks into complex ones. Those complex blocks can then be managed as if they were still simple blocks. In fig. 10, we have 3 simple image operators Image → Image (the

    auto operator+(Image auto A, Image auto B) {
      return transform(A, B, std::plus<>());
    }

    auto togray = [](Image auto A) {
      return transform(A, [](auto x) { return (x.r + x.g + x.b) / 3.f; });
    };

    auto subquantize16to8b = [](Image auto A) {
      return transform(A, [](float x) { return uint8_t(x / 256 + .5f); });
    };
too. This brings the practitioner a productivity gain.

3.3 Reasoning at image level

    auto alphablend =
      [](auto ima1, auto ima2, float alpha) {
        return alpha * ima1 +
               (1 - alpha) * ima2; };

image level. The code has become more readable, more expressive and more efficient by default.
[Figure: background subtraction pipeline. Recoverable stage labels: Background (RGB-8), Grayscale Conversion, Subtract, Thresholding, Opening (Erosion+Dilation).]

mation pipeline is applied on the RDD and compu-
    float kThreshold = 150; float kVSigma = 10;
    float kHSigma = 10; int kOpeningRadius = 32;

    auto img_gray = view::transform(img_color, to_gray);
    auto bg_gray = view::transform(bg_color, to_gray);
    auto bg_blurred = gaussian2d(bg_gray, kHSigma, kVSigma);
    auto tmp_gray = img_gray - bg_blurred;
    // Capture the threshold by copy so the lambda is valid for local constants.
    auto thresholdf = [=](auto x) { return x < kThreshold; };
    auto tmp_bin = view::transform(tmp_gray, thresholdf);
    auto ero = erosion(tmp_bin, disc(kOpeningRadius));
    dilation(ero, disc(kOpeningRadius), output);

Figure 18: Pipeline implementation with views. Highlighted code uses views by prefixing operators with the namespace view.

OpenCV for blur and dilation/erosion) so that the comparison makes sense. It allows us to validate experimentally the advantages of views in pipelines. First, we have to be cautious about the real benefit in terms of processing time. Here, most of the time is spent in algorithms that are not eligible for view transformation. Thus, depending on the operations of the pipeline, views may not improve processing time. Nevertheless, using views does not degrade performance either (only 1% in this experiment). It seems to show that using views does not introduce performance penalties and may even be beneficial in lightweight pipelines such as the one in section 3. On the memory side, views drastically reduce the memory usage, which is beneficial when developing memory-constrained applications. From the developer standpoint, it requires only a few changes in the code, as shown in fig. 18 (the implementation of the algorithms remains the same), which is a real advantage for software maintenance.

6 Conclusion

Thanks to simple yet concrete examples, we have shown how modern C++ and the generic programming paradigm can ease image processing software development. We have given a particular focus to the concepts of image views and have shown that they improve both the performance and the usability of an image processing framework. These ideas have been implemented in our C++20 library [13] and used for concrete image processing applications (medical imaging and document analysis). We have compared our design to existing similar designs in data flow oriented programming and outlined the main differences. Nonetheless, generic programming in C++ comes with some downsides. Templates belong to the static world and selecting algorithmic specialization based on runtime conditions is not trivial. It requires either ahead-of-time generation of specializations, which increases compile times and does not scale with the size of the parameter space, or switching to a more dynamic paradigm that could degrade performance. Dealing with dynamic should not be an option when it comes down to exposing a static library to a dynamic language like Python. As future work, we will research ways to address this issue.

References

[1] Martín Abadi et al. TensorFlow: Large-scale machine learning on heterogeneous systems, 2015. Software available from tensorflow.org.
[2] B. Andres, U. Koethe, T. Kroeger, and F.A. Hamprecht. Runtime-flexible multi-dimensional arrays and views for C++98 and C++0x. arXiv preprint arXiv:1008.2909, IWR, Univ. of Heidelberg, Germany, 2010.
[3] Apache Software Foundation. Hadoop.
[4] G. Berti. GrAL–the grid algorithms library. Future Generation Computer Systems, 22(1-2):110–122, 2006.
[5] Boost. Boost C++ libraries.
[6] L. Bourdev. Generic image library. http://www.lubomir.org/pdfs/GIL_SDJ.pdf, 2020.
[7] G. Bradski. The OpenCV library. Dr. Dobb’s Journal of Software Tools, 25:122–125, November 2000.
[8] F. Brill and E. Albuz. NVIDIA VisionWorks toolkit. Presented at the 2014 GPU Technology Conference, 2014.
[9] G. Brown, C. Di Bella, M. Haidl, T. Remmelg, R. Reyes, and M. Steuwer. Introducing parallelism to the ranges TS. In Proceedings of the International Workshop on OpenCL, pages 1–5, 2018.
[10] Nicolas Burrus et al. A static C++ object-oriented programming (SCOOP) paradigm mixing benefits of traditional OOP and generic
programming. In Proceedings of the Workshop on Multiple Paradigm with Object-Oriented Languages (MPOOL), Anaheim, CA, USA, October 2003.
[11] I. Burylov, M. Chuvelev, B. Greer, G. Henry, S. Kuznetsov, and B. Sabanin. Intel performance libraries: Multi-core-ready software for numeric-intensive computation. Intel Technology Journal, 11(4), 2007.
[12] E. Carlinet et al. MToS: A tree of shapes for multivariate images. IEEE Transactions on Image Processing, 24(12):5330–5342, 2015.
[13] E. Carlinet et al. Pylene: a modern C++ image processing generic library, 2018. https://gitlab.lrde.epita.fr/olena/pylene.
[14] D. Coeurjolly, J.-O. Lachaud, and B. Kerautret. DGtal: Digital geometry tools and algorithms library, 2019. https://dgtal.org/.
[15] Jeffrey Dean and Sanjay Ghemawat. MapReduce: Simplified data processing on large clusters. Commun. ACM, 51(1):107–113, January 2008.
[16] J.C. Dehnert and A. Stepanov. Fundamentals of generic programming. In Generic Programming, volume 1766 of LNCS, pages 1–11. Springer, 2000.
[17] Thierry Géraud et al. Semantics-driven genericity: A sequel to the static C++ object-oriented programming paradigm (SCOOP 2). In Proceedings of the 6th International Workshop on Multiparadigm Programming with Object-Oriented Languages (MPOOL), Paphos, Cyprus, July 2008.
[18] J. Y. Gil and R. Kimmel. Efficient dilation, erosion, opening, and closing algorithms. IEEE Transactions on Pattern Analysis and Machine Intelligence, 24(12):1606–1617, 2002.
[19] B. Grünbaum and G. C. Shephard. Tilings and Patterns. W. H. Freeman & Co., 1986.
[20] Dries Harnie et al. Scaling machine learning for target prediction in drug discovery using Apache Spark. Future Generation Computer Systems, 67:409–417, 2017.
[21] H. Homann and F. Laenen. SoAx: A generic C++ structure of arrays for handling particles in HPC codes. Computer Physics Communications, 224:325–332, 2018.
[22] J. Järvi, M.A. Marcus, and J.N. Smith. Library composition and adaptation using C++ concepts. In Proceedings of the 6th International Conference on Generative Programming and Component Engineering, pages 73–82, 2007.
[23] E. Jones, T. Oliphant, P. Peterson, et al. SciPy: Open source scientific tools for Python, 2001–. http://www.scipy.org.
[24] Khronos Group. OpenVX. https://www.khronos.org/openvx/, 2019.
[25] U. Köthe. STL-style generic programming with images. C++ Report Magazine, 12(1):24–30, 2000. https://ukoethe.github.io/vigra.
[26] R. Levillain et al. Why and how to design a generic and efficient image processing framework: The case of the Milena library. In Proceedings of the IEEE Intl. Conf. on Image Processing (ICIP), pages 1941–1944, Hong Kong, 2010.
[27] R. Levillain et al. Practical genericity: Writing image processing algorithms both reusable and efficient. In Proc. of the 19th Iberoamerican Congress on Pattern Recognition (CIARP), volume 8827 of LNCS, pages 70–79. Springer, 2014.
[28] F. Meyer and J. Stawiaski. Morphology on graphs and minimum spanning trees. In Proc. of the Intl. Symp. on Mathematical Morphology (ISMM), volume 5720 of LNCS, pages 161–170. Springer, 2009.
[29] C. Misale, M. Drocco, G. Tremblay, et al. PiCo: High-performance data analytics pipelines in modern C++. Future Generation Computer Systems, 87:392–403, 2018.
[30] David R. Musser and Alexander A. Stepanov. Generic programming. In Intl. Symp. on Symbolic and Algebraic Computation, pages 13–25. Springer, 1988.
[31] E. Niebler and C. Carter. P1037R0: Deep integration of the ranges TS, May 2018. https://wg21.link/p1037r0.
[32] B. Perret, G. Chierchia, J. Cousty, S.J. F. Guimarães, Y. Kenmochi, and L. Najman. Higra: Hierarchical graph analysis. SoftwareX, 10:100335, 2019.
[33] G. X. Ritter, J. N. Wilson, and J. L. Davidson. Image algebra: An overview. Computer Vision, Graphics, and Image Processing, 49(3):297–331, 1990.
[34] M. Roynard, E. Carlinet, and T. Géraud. An image processing library in modern C++: Getting simplicity and efficiency with generic programming. In Reproducible Research in Pattern Recognition, 2nd Intl. Workshop, volume 11455 of LNCS, pages 121–137. Springer, 2019.
[35] R. Smith. N4849: Working draft, standard for programming language C++. Technical report, January 2020. https://wg21.link/n4849.
[42] T. L. Veldhuizen. Blitz++: The library that thinks it is a compiler. In Advances in Software Tools for Scientific Computing, volume 10 of Lecture Notes on Computational Science and Engineering, pages 57–87. Springer, 2000.
[43] M. Werner. GIS++: Modern C++ for efficient and parallel in-memory spatial computing. In Proc. of the ACM SIGSPATIAL Intl. Workshop on Geospatial Data Access and Processing APIs, pages 1–2, 2019.
[44] Y. Xu, T. Géraud, and L. Najman. Connected filtering on tree-based shape-spaces. IEEE Transactions on Pattern Analysis and Machine Intelligence, 38(6):1126–1140, 2015.
[45] Matei Zaharia et al. Resilient distributed datasets: A fault-tolerant abstraction for in-memory cluster computing. In 9th USENIX Symposium on Networked Systems Design and Implementation (NSDI 12), pages 15–28, San Jose, CA, April 2012. USENIX Association.