GPP The Generic Preprocessor

GPP is a generic preprocessor tool that programmatically alters source code and text files based on inline annotations. Preprocessors are used to extend or translate programming and markup languages as well as conditionally generate source code and text. While early preprocessors were specialized, GPP is a flexible, general-purpose preprocessor whose syntax and behavior can be customized, making it useful for research applications involving novel languages.


GPP, the Generic Preprocessor

Tristan Miller
Austrian Research Institute for Artificial Intelligence
Freyung 6/3, 1010 Vienna, Austria
ORCID: 0000-0002-0749-1100
Denis Auroux
Department of Mathematics, Harvard University
1 Oxford Street, Cambridge, MA 02138, USA

arXiv:2008.00840v1 [cs.PL] 3 Aug 2020

This is a preprint of the following publication: Tristan Miller and Denis Auroux. GPP, the generic preprocessor. Journal of Open Source Software, 5(51), July 2020. ISSN 2475-9066. DOI: 10.21105/joss.02400

Summary

In computer science, a preprocessor (or macro processor) is a tool that programmatically alters its input, typically on the basis of inline annotations, to produce data that serves as input for another program. Preprocessors are used in software development and document processing workflows to translate or extend programming or markup languages, as well as for conditional or pattern-based generation of source code and text. Early preprocessors were relatively simple string replacement tools that were tied to specific programming languages and application domains, and while these have since given rise to more powerful, general-purpose tools, these often require the user to learn and use complex macro languages with their own syntactic conventions. In this paper, we present GPP, an extensible, general-purpose preprocessor whose principal advantage is that its syntax and behaviour can be customized to suit any given preprocessing task. This makes GPP of particular benefit to research applications, where it can be easily adapted for use with novel markup, programming, and control languages.

Background

Preprocessors date back to the mid-1950s, when they were used to extend individual assembly languages with constructs that would later be found in high-level programming languages (Layzell, 1985). These languages, in turn, fostered the development of yet more special-purpose preprocessors aimed at providing even higher-level constructs, such as conditional loops and other control structures in FORTRAN (Meissner, 1975) and COBOL (Triance, 1980). The need for generalized, language-independent tools was eventually recognized (McIlroy, 1960), leading to the development of general-purpose preprocessors such as GPM (Strachey, 1965) and ML/I (Brown, 1967).

By the end of the 1960s, preprocessors had attracted a considerable amount of attention, by computing theorists and practitioners alike, and their use in software engineering had expanded beyond the augmentation and adaptation of programming languages. A survey paper by Brown (1969) identified four broad application areas: language extension, systematic searching and editing of source code, translation between programming languages, and code generation (i.e., simplifying the writing of highly repetitive code, parameterizing a program by substituting compile-time constants, or producing variants of a program by conditionally including certain statements or modules). While the first three of these application areas have largely been rendered obsolete by today’s integrated development environments and expressive, feature-rich programming languages, implementing software variability with language-specific and general-purpose preprocessors remains commonplace (Apel et al., 2013; Kästner et al., 2012).

Text processing became another main application area for preprocessors, in particular to generate documents on the basis of user-specified conditions or patterns, and to convert between document markup languages (Walden, 2014). The earliest such uses were ad-hoc repurposings of programming language–specific preprocessors to operate on human-readable texts (Keese, 1964; Stallman and Weinberg, 2020); these were soon supplanted by text-specific macro languages such as TRAC (Mooers and Deutsch, 1965), which were positioned as tools for stenographers and other writing professionals. More recently it has been common to use general-purpose preprocessors (Mailund, 2019; Pesch, 1992).

Statement of Need

Criticism of preprocessors commonly focuses on the idiosyncratic languages they employ for their own built-in directives and for users to define and invoke macros. The languages of early preprocessors were derided as “clumsy and restrictive” (Layzell, 1985) and “hard to read” (Brown, 1969), and even modern preprocessors are sometimes attacked for relying on “the clumsiness of a separate language of limited expressiveness” (Ernst et al., 2002) or, at the other extreme, for being overly complicated, quirky, opaque, or hard to learn, even for experienced programmers and markup users (Ernst et al., 2002; Paddon, 1993; Pesch, 1992).

Our general-purpose preprocessor, GPP, avoids these issues by providing a lightweight but flexible macro language whose syntax can be customized by the user. The tool’s built-in presets allow its directives to be made to resemble those of many popular languages, including HTML and TeX. This greatly reduces the learning curve for GPP when it is used with these languages, eliminates the cognitive burden of repeatedly “mode switching” between source and preprocessor syntax when reading or composing, and allows existing syntax highlighters and other tools to process GPP directives with little or no further configuration. Furthermore, users are not limited to using these presets, but can fully define their own syntax for GPP directives and macros. This makes GPP particularly attractive for use in research and development, where its syntax can be readily adapted to match novel programming and markup languages.

GPP’s independence from any one programming or markup language makes it more versatile than the C Preprocessor, which was formerly “abused” as a general text processor and is still sometimes (inappropriately) used for non-C applications (Stallman and Weinberg, 2020). While GPP is less powerful than m4 (Seindal et al., 2016), it is arguably more flexible, and supports all the basic operations expected of a modern, high-level preprocessing system, including conditional tests, arithmetic evaluation, and POSIX-style wildcard matching (“globbing”). In addition to macros, GPP understands comments and strings, whose syntax and behaviour can also be widely customized to fit any particular purpose.

GPP in research

GPP has already been integrated into a number of third-party projects in basic and applied research. These include the following:

• The Waveform Definition Language (WDL) is Caltech Optical Observatories’ C-like language for programming astronomical research cameras. WDL uses GPP to preprocess configuration files containing signals and parameters specific to the camera controllers, flags setting the devices’ operating modes and image properties, and timing rules. According to the developers, GPP was chosen over the C Preprocessor “for added flexibility and to avoid some C-like limitations” (Kaye et al., 2017).

• XSB is a research-oriented, commercial-grade logic programming system and Prolog compiler. The developers chose to make GPP XSB’s default preprocessor because it “maintains a high degree of compatibility with the C preprocessor, but is more suitable for processing Prolog programs” (Swift et al., 2017).

• C-Control Pro is a family of electronic microcontrollers produced by Conrad Electronic; they are specifically designed for industrial and automotive applications. The official software development kit includes a modified version of GPP for use with the products’ BASIC and Compact-C programming languages (Schirm and Sprenger, 2007).

• SUS is a tool that allows system administrators to exercise fine-grained control over how users can run commands with elevated privileges. It has a sophisticated control file syntax that is preprocessed with GPP (Gray, 2001).

Apart from these uses, GPP is occasionally cited as previous or related work in scholarly publications on metaprogramming or compile-time variability of software (Apel et al., 2013; Baxter and Mehlich, 2001; Behringer, 2017; Blendinger, 2010; Dreiling, 2010; Kästner et al., 2012; Lotoreychik and Shopyrin, 2006; Zmiry, 2016).

Acknowledgments

Tristan Miller is supported by the Austrian Science Fund (FWF) under project M 2625-N31. Denis Auroux is partially supported by NSF grant DMS-1937869 and by Simons Foundation grant #385573. The Austrian Research Institute for Artificial Intelligence is supported by the Austrian Federal Ministry for Science, Research and Economy.

References

Apel, Sven et al. (Oct. 2013). Classic, Tool-Driven Variability Mechanisms. Feature-Oriented Software Product Lines. Berlin/Heidelberg: Springer-Verlag. isbn: 978-3-642-37520-0. doi: 10.1007/978-3-642-37521-7_5.
Baxter, Ira D. and Michael Mehlich (2001). Preprocessor Conditional Removal by Simple Partial Evaluation. Proceedings of the 8th Working Conference on Reverse Engineering. IEEE, pp. 281–290. isbn: 0-7695-1303-4. doi: 10.1109/WCRE.2001.957833.
Behringer, Benjamin (July 2017). “Projectional Editing of Software Product Lines – The PEoPL Approach”. PhD thesis. Faculty of Sciences, Technology and Communication, Université de Luxembourg.
Blendinger, Frank (Aug. 2010). “A Filesystem-Based Approach to Support Product Line Development with Editable Views”. Diploma Thesis. Department of Computer Sciences 4, Friedrich-Alexander University Erlangen-Nuremberg.
Brown, P. J. (Oct. 1967). The ML/I Macro Processor. Communications of the ACM 10(10):618–623. issn: 0001-0782. doi: 10.1145/363717.363746.
Brown, P. J. (1969). A Survey of Macro Processors. Annual Review in Automatic Programming 6:37–88. issn: 0066-4138. doi: 10.1016/0066-4138(69)90001-9.
Dreiling, Alexander (July 2010). “Feature Mining: Semiautomatische Transition von (Alt-)Systemen zu Software-Produktlinien”. Diploma thesis. Fakultät für Informatik, Institut für Technische und Betriebliche Informationssysteme, Otto-von-Guericke-Universität Magdeburg.
Ernst, Michael D., Greg J. Badros, and David Notkin (Dec. 2002). An Empirical Analysis of C Preprocessor Use. IEEE Transactions on Software Engineering 28(12):1146–1170. issn: 0098-5589. doi: 10.1109/TSE.2002.1158288.
Gray, Peter D. (Dec. 2001). SUS – An Object Reference Model for Distributing UNIX Super User Privileges. Proceedings of the LISA 2001 15th Systems Administration Conference. The USENIX Association, pp. 15–18.
Kästner, Christian et al. (June 2012). Type Checking Annotation-Based Product Lines. ACM Transactions on Software Engineering and Methodology 21(3):14:1–14:39. doi: 10.1145/2211616.2211617.
Kaye, Stephen et al. (2017). Waveform Definition Language. Tech. rep. Pasadena, CA: Caltech Optical Observatories.
Keese Jr., W. M. (Sept. 1964). A Note on Automatic Generation of Documentation by Macro Assemblers. Technical memorandum TM-64-1031-1. Washington, DC: Bellcom, Inc.
Layzell, P. J. (Jan. 1985). The History of Macro Processors in Programming Language Extensibility. The Computer Journal 28(1):29–33. issn: 0010-4620. doi: 10.1093/comjnl/28.1.29.
Lotoreychik, V. Yu. and D. G. Shopyrin (2006). Metaprogrammirovaniye na osnove tekstovogo preprotsessora [Text Preprocessor–Based Metaprogramming]. Nauchno-Tehnicheskii Vestnik Informatsionnykh Tekhnologii, Mekhaniki i Optiki [Scientific and Technical Journal of Information Technologies, Mechanics and Optics] 6(2):57–65. issn: 2226-1494.
Mailund, Thomas (2019). Preprocessing. Introducing Markdown and Pandoc: Using Markup Language and Document Converter. Berkeley, CA: Apress. isbn: 978-1-4842-5148-5. doi: 10.1007/978-1-4842-5149-2_10.
McIlroy, M. Douglas (Apr. 1960). Macro Instruction Extensions of Compiler Languages. Communications of the ACM 3(4):214–220. issn: 0001-0782. doi: 10.1145/367177.367223.

Meissner, Loren P. (Sept. 1975). On Extending Fortran Control Structures to Facilitate Structured Programming. SIGPLAN Notices 10(9):19–30. issn: 0362-1340. doi: 10.1145/987316.987320.
Mooers, Calvin N. and L. Peter Deutsch (Aug. 1965). TRAC, a Text-Handling Language. ACM ’65: Proceedings of the 20th National Conference. Ed. by Lewis Winner. New York: Association for Computing Machinery, pp. 229–246. isbn: 978-1-4503-7495-8. doi: 10.1145/800197.806048.
Paddon, Michael (1993). Shake: A Portable Tool for Generating Makefiles. AUUG ’93 Conference Proceedings. Kensington, NSW, Australia: AUUG Inc., pp. 145–156.
Pesch, R. H. (1992). Configurable Manuals. Conference Record on Crossing Frontiers, pp. 776–780. isbn: 0-7803-0788-7. doi: 10.1109/IPCC.1992.673146.
Schirm, Reiner and Peter Sprenger (2007). Der Preprozessor. Messen, Steuern und Regeln mit C-Control Pro: Praxisanwendungen, Schaltungstechnik und Programmierung. Poing, Germany: Franzis. isbn: 978-3-7723-4097-0.
Seindal, René et al. (Dec. 2016). GNU M4, Version 1.4.18. Free Software Foundation.
Stallman, Richard M. and Zachary Weinberg (2020). Overview. The C Preprocessor. GCC 10.1.0. Free Software Foundation.
Strachey, C. (Jan. 1965). A General Purpose Macrogenerator. The Computer Journal 8(3):225–241. issn: 0010-4620. doi: 10.1093/comjnl/8.3.225.
Swift, Theresa et al. (Oct. 2017). The XSB System, Version 3.8.x, Volume 1: Programmer’s Manual.
Triance, J. M. (1980). Structured Programming in COBOL – The Current Options. The Computer Journal 23(3):194–200. doi: 10.1093/comjnl/23.3.194.
Walden, David (2014). Macro Memories, 1964–2013. TUGboat: The Communications of the TeX Users Group 35(1):99–110.
Zmiry, Iddo E. (Apr. 2016). “Lola 0.064: A Programming Language for Augmenting Programming Languages”. MA thesis. Technion – Israel Institute of Technology.
