Using Functional Programming
takes control of CPUs, memory and interrupt mappings. Next, it spawns a control domain, a small Linux VM that provides system management services and physical device drivers for networking and storage.

The main XenServer management process that resides in the control domain is known as XAPI, because it is the service that provides the XenAPI. The service’s primary responsibility is to listen for XenAPI calls (made over the network) and execute these requests. In addition, XAPI itself implements resource pools (dealing with the distributed systems challenges that this entails), maintains a durable, replicated database of configuration data on behalf of the resource pool and is responsible for high-availability planning and failover.¹ The XAPI source code, consisting of approximately 130 kLOC of OCaml, is open source and can be freely downloaded under the LGPLv2 license.²

¹ See http://community.citrix.com/x/O4KZAg
² See http://www.xen.org/products/cloudxen.html

One of the defining characteristics of XAPI is that it communicates with all major components of the system. On the one hand it accepts connections from clients (e.g. the XenCenter GUI), performing XenAPI requests on their behalf and providing access to a variety of data-streaming services (e.g. remote access to VM consoles, importing and exporting VM disk images). On the other hand, XAPI interfaces with other software components within the server, including the Xen hypervisor and the networking and storage subsystems. This requires XAPI to use a variety of different interfaces, including (i) calling into statically-linked C APIs to communicate with the Xen hypervisor and the Linux kernel; (ii) forking new processes to invoke vendor-specific storage scripts or other shell commands; (iii) utilising a variety of different IPC mechanisms, for example to communicate with subprocesses involved in a live VM migration [3]; and (iv) performing protocol processing functions over both TCP and Unix domain sockets to receive and parse XenAPI requests.

Another property of XAPI is that it is highly concurrent. As well as managing a number of long-running background housekeeping threads, XAPI accepts and processes concurrent XenAPI requests across multiple connections from multiple clients and deals with communication between the multiple servers and shared storage devices that comprise a resource pool.

2. Authors’ Perspectives

In this section we describe our perspectives of using OCaml within the context of the XenServer project. We discuss why OCaml was selected, describe the reactions within the company to using a non-mainstream language for product development and relate some of our technical experiences.

2.1 Selection of OCaml

The XenServer product did not start out within Citrix, but was first conceived within a startup called XenSource. Citrix acquired XenSource (and hence the XenServer team and product) in 2007. There were a number of factors within XenSource that drove the choice of OCaml and enabled the XAPI project to reach inception:

1. XenSource was staffed by a number of ex-researchers from the University of Cambridge Computer Laboratory. Many of these engineers had used OCaml before in a research environment and believed that, for large projects, the OCaml language offered significant productivity benefits over both traditional systems languages, such as C, and dynamically typed languages, such as Python [10].
2. As a startup, XenSource had a culture of innovation and risk-taking. In this environment there were a number of influential people within the company who supported the use of OCaml, feeling that the risks of using a non-mainstream language were worth taking in return for the efficiencies that the engineers claimed it would bring.
3. XenSource had weak project governance within engineering. Thus, even though there were many people within the company who felt that using a non-mainstream language was not the right decision, the OCaml project started anyway and quickly built momentum as a grassroots effort.

These factors are all non-technical; they created the environment in which a product-development initiative based on a non-mainstream language could be seeded. But there were also technical reasons why OCaml was chosen over other languages for the XAPI project:

1. Performance: XenSource engineers had used OCaml on previous projects and were confident that it could deliver the required performance for the project [11].
2. Integration: OCaml’s low-overhead foreign-function interface and existing Unix bindings facilitated the required interactions with the myriad of software components that made up the XenServer system.
3. Robustness: As a long-running service, XAPI must not crash. This requirement made OCaml’s static type-safety and managed heap very appealing, offering the potential to reduce run-time failures due to type errors, memory leaks or heap corruption.
4. Compactness: there were plans for embedded versions of XenServer on flash storage as small as 16MB. The relatively simple OCaml run-time and compact native code output were key to this requirement.

There were other languages that met the above criteria, the most notable being Haskell. The primary reason for choosing OCaml over Haskell was non-technical. The engineers involved in the project had considerably more experience of using OCaml, and using it reduced training costs (training time being a luxury in a fast-paced startup). Our previous experiences had also given us confidence that the OCaml tool-chain would meet the project requirements.

2.2 Reactions within the company

Choosing OCaml for a product development project was a contentious decision that created some heated debate within XenSource. While the engineers in the MTT firmly believed that the benefits of using OCaml outweighed the risks, others strongly believed that the risks of using a non-mainstream language for a major product development project were simply too great. Specific risks that were highlighted included:

1. We will not be able to hire OCaml programmers quickly enough to grow the team.
2. A large code base in a non-mainstream language will make XenSource a less attractive acquisition target.
3. Other teams (staffed with programmers who don’t know OCaml) will not be able to work with the MTT because of “the language barrier”.
4. The OCaml tool-chain may not be mature enough to support the development of a complex system.

The MTT had enough experience of using OCaml to argue convincingly that Risk 4 could be effectively mitigated. However, at the time the XAPI project was initiated, there was no data available regarding Risks 1 to 3, so debate (although heated) made little forward progress.
In hindsight, none of the risks above materialised. A year after work on XAPI started, Citrix paid $500M for XenSource, and the technical due-diligence process performed during the acquisition made it very clear that a large chunk of XenServer was implemented in OCaml. There were also no problems hiring OCaml programmers (§3), and other teams were able to work very effectively with the MTT (§4).

2.3 Technical experiences

We conducted a preliminary user study among the engineering group, with a set of open-ended questions designed to elicit individual opinions. Overall, the MTT report positive experiences of using OCaml on the XenServer project. Without exception, the engineers within the MTT believe that developing XAPI in OCaml has been a success, with the type system and automatic memory management being the most widely cited benefits of the language. Engineers also report that they “enjoy programming in OCaml”, particularly emphasising the fact that they believe OCaml allows them to express complex algorithms concisely. There is also a shared belief within the MTT that, overall, the choice of OCaml has enabled the team to be more productive than they would have been had they chosen a more mainstream language for the project (e.g. C++ or Python). Note that Java and .NET-based languages were not included because the size of their runtime environments was not compatible with the ‘compactness’ requirement (§2). These positive experiences are backed up by internal test data and component defect levels that demonstrate that the quality and performance of the XAPI component is good.

However, despite the overall positive outcome, there have been some technical challenges that relate to the choice of OCaml. These challenges are not due to the OCaml language per se, but are due to lack of available library support, the complexity of the Foreign Function Interface (FFI) and the limitations of the OCaml toolchain. We consider each of these issues in more detail in the remainder of this section.

2.3.1 Lack of Library Support

We found OCaml’s library support for common data structures and algorithms generally sufficient for our needs. However, the lack of library support for common systems protocols was more problematic. In particular we ended up having to write a pipelined HTTP/1.1 server from scratch and to handcraft our own SSL solution, using separate stunnel³ processes to terminate and initiate SSL connections and communicating with these over IPC.

There were some open source HTTP and SSL OCaml libraries available. However, at the time, the libraries that we evaluated were not fully featured or robust enough to meet the requirements of the XAPI project.

³ Universal SSL wrapper: http://www.stunnel.org

2.3.2 C Bindings

Writing C bindings was difficult and error-prone. Despite careful code review and a policy of “keeping things simple” (avoiding references into the heap across the FFI, and avoiding use of callbacks whenever possible) some bugs still crept through, creating occasional XAPI segmentation faults that were hard to reproduce and track down.

2.3.3 Lack of Tool Support

Our heavy use of threads and fork(2) made it impossible for us to effectively use ocamldebug or ocamlprof. Instead we relied on gdb and gprof directly against the compiled binary. This was better than nothing, but the low-level nature of gdb made it hard to relate the debugging output back to the OCaml source.

Likewise, the lack of high-level profiling data made performance tuning harder than it should have been, and made it difficult to track down memory leaks.⁴

⁴ In a garbage collected language, like OCaml, memory leaks occur when global references to objects are not cleaned up explicitly (e.g. if something is added to a global hash-table and not subsequently removed).

2.4 Technical Lessons Learnt

Over the last four years of commercial OCaml development, we have learnt several technical lessons regarding its use. Some of these are outlined in this section.

2.4.1 Stability of Tools and Runtime

In the early days of XAPI development, we had no idea if the OCaml runtime (e.g. the garbage collector) would be robust enough to support long-running processes like XAPI that are required to execute continuously for months at a time. We joined the OCaml Consortium to offset this risk, providing us with a support channel in case bugs arose.

However, it transpired that the OCaml runtime was remarkably stable. Our automated test system puts XAPI through 2000 machine-hours of testing per night, and also runs regular stress and soak tests that last for weeks on end. Customers also run their XenServers for several months at a time without restarting XAPI. Despite all this testing, we have never had a single XenServer defect reported from internal testing or from the field that can be traced back to a bug in the OCaml runtime or compiler. (During development we did once find a minor compiler bug, triggered when compiling auto-generated OCaml code with many function arguments, but this was already fixed in the development branch by the time we reported it and so no interaction with the maintainers at INRIA was required.)

2.4.2 The Right Style for the Right Job

OCaml allows for many programming techniques to be used in the same codebase. XAPI takes full advantage of this fact, using different programming styles to solve different problems:

Imperative: Many of the lower-level modules of XAPI (e.g. those that interface with the hypervisor and control domain kernel) consist of step-wise, imperative code and look like type-safe C. OCaml fully supports this style with language constructs such as for/while loops and references.

Functional: Although a good chunk of XAPI is unashamedly imperative, some of the higher-level aspects of the system are functional in nature. For example the high-availability feature requires algorithms for distributed failure planning. These algorithms (e.g. bin packing) are implemented in a functional style.

One function of XAPI is to communicate with Xenstore. The Xenstore service, which runs in the control domain, provides a tuple-space that is used for co-ordination between VMs and the XenServer management tools [7]. Xenstore exposes an asynchronous event interface that is hard to use. XAPI abstracts much of this complexity behind a straightforward combinator library that handles events via composable functions. For example, consider the following code fragment:

    wait_for (any_of [
      `OK, value_to_appear "/path1";
      `Failed, value_to_become "/path" v ])

The expression value_to_appear "/path1" represents the act of waiting for any value to become associated with key "/path1". The expression value_to_become "/path" v
represents the act of waiting for a specific value v to become associated with key "/path". The expression any_of represents the act of waiting for any one of a set of labelled options; in this example the label `OK is used to represent a success case and the label `Failed represents a failure case. Finally the function wait_for uses the Xenstore event interface, returning either `OK or `Failed as appropriate.

Meta-programming: XAPI has a distributed database that runs across all the hosts in a resource pool, including failover and replication algorithms. The OCaml code that interfaces with this database and performs remote calls is all auto-generated by a compiler from a succinct specification. Similarly, all of the XenAPI bindings to other languages (C, C#, Java) are generated from a single data-model.

Object-oriented: OCaml provides a comprehensive object system, but it is not used in XAPI except in small, local cases. Although we have nothing specific against using it, a compelling case for introducing it has never emerged. Modules, functors and polymorphic variants have been sufficient to date, and we anticipate that first-class packaged modules (in OCaml 3.12+) will further reduce the need for using objects.

2.4.3 Garbage Collect Everything

The automatic memory management that OCaml provides is a huge improvement over using C, but we still frequently get leaks due to mismatched allocation/deallocation of other limited OS resources, such as file descriptors and shared memory segments. These leaks are usually only caught by automated stress testing, since the code involved works fine during development. Nowadays, we make an effort to abstract as many of the OS resources as possible behind our own extensions to the standard library.

3. Hiring Patterns

Despite concerns raised at the start of the XAPI project, the MTT has had no difficulty in finding and hiring good OCaml programmers, and has been able to grow at a comparable rate to the other XenServer teams that used mainstream languages. From October 2006 to April 2010, 12 engineers were hired into OCaml-programming positions (roughly a quarter of all XenServer engineers hired over the period).

There are two interesting observations about the MTT’s hiring patterns. Firstly, we found that posting on functional programming mailing lists (including the OCaml List and Haskell Cafe) has consistently generated good inflows of high quality candidates interested in industrial functional programming positions. Secondly, we have found that previous OCaml experience is not a prerequisite for hiring into OCaml-programming positions.

In fact, of the 12 engineers hired, only 2 had prior experience of OCaml; the other 10 learnt OCaml after they started work at XenSource or Citrix. Interestingly, having to learn OCaml did not make a big difference to the training time of the new engineers: the 10 engineers that did not know OCaml became productive at about the same speed as the 2 engineers that did have prior OCaml experience.

We believe that this is because, for a complex software product like XenServer, getting to know one’s way around the various code-bases and getting to grips with the architectural principles of the wider system is a much more time consuming task than learning a new programming language. The 10 engineers that did not know OCaml were already highly proficient programmers who had a solid grounding in data-structures, algorithms and computer science more generally.

4. OCaml Code contribution

As described earlier (§1.1), the XenServer Engineering Group consists of five teams of full-time software engineers, supplemented by contractors. Each team is responsible for a different software component. The source code for each component is stored in a number of version-controlled repositories using Mercurial [14]. Each repository contains a complete historical record listing every code change, when it was made, who made it and why. In this section we will examine this historical record to identify and analyse which teams contributed to which components. We shall use this data to answer the question:

“Did the use of OCaml within the MTT prevent engineers from other teams making significant contributions to the XAPI project?”

For our analysis we shall focus on four components:

1. Management Console: a Windows user interface maintained by the User Interface team;
2. Storage: a set of plugin modules to connect XenServer to back-end storage arrays where VM disks are stored, maintained by the Storage team;
3. XAPI: the component which implements the XenAPI, maintained by the MTT; and
4. Windows drivers: drivers required for high-performance VM I/O, maintained by the Windows Driver team.

The components were chosen for the following reasons:

1. they were all created solely for the XenServer product (unlike, for example, the open-source Xen hypervisor, which was created as part of a research project a few years before the XenServer product emerged);
2. they are all maintained by different teams; and
3. they all primarily use different programming languages (even the XAPI code contains traces of C).

The following table gives approximate sizes and primary language data for each component:⁵

Component            Size       Main Languages
XAPI                 130 kLOC   OCaml
Windows Drivers      80 kLOC    C, C++
Management Console   200 kLOC   C#
Storage              40 kLOC    Python, C

⁵ The XAPI number excludes auto-generated OCaml code, the Windows Drivers number excludes header files as most are auto-generated, and the Management Console number excludes auto-generated XenAPI and Windows Forms code.

The diagram in Figure 1 displays four bars, one for each component in the analysis. The height of each bar indicates the total number of individuals who contributed code to each component. The bars are subdivided into sections, each one coloured to indicate the team the contributor belonged to.

The diagram in Figure 2 displays four bars, one for each component as before. The bars now represent the relative contribution size from members of each team to each component. It is clear that, in all cases, the team responsible for maintaining a component makes the majority of contributions. However it is also clear that, in all cases, members of other teams made contributions.

The size and colouring of the bar corresponding to XAPI in Figure 1 clearly shows that the use of OCaml did not prevent engineers from other teams making contributions. Furthermore, the size and colouring of the bar corresponding to XAPI in Figure 2
[Figure 1: stacked bar chart. x-axis: Component (primary languages used), with bars for Management Console (C#), Storage (python, C), Windows Drivers (C, CPP) and xapi (ocaml); y-axis ticks from 0 to 40. Legend: MTT, User Interface, Storage, Windows Drivers, Hypervisor/Kernel, Unknown.]
Figure 1. The total height of each bar shows the total number of unique contributors to each component. The colour indicates the proportion of contributors from each team.

[Figure 2: stacked bar chart. x-axis: Component (primary languages used), same four components and legend as Figure 1; y-axis from 0% to 100%.]
Figure 2. Each coloured section indicates the size of contributions to a component by a team, relative to the total contributions.
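
The two quantities plotted in Figures 1 and 2 can be derived from a list of change records in a few lines of OCaml. The following is only a minimal sketch, not the analysis actually used for the figures: the `change` record, its field names, the sample data and the choice of changed lines as the measure of "contribution size" are all assumptions made for illustration.

```ocaml
(* Hypothetical change record; the paper does not describe its data format. *)
type change = {
  component : string;  (* component the change was committed to *)
  author : string;     (* individual who made the change *)
  team : string;       (* team the author belonged to *)
  lines : int;         (* assumed size measure: number of lines changed *)
}

module SS = Set.Make (String)

(* Figure 1 quantity: number of unique contributors to [comp], per team. *)
let contributors_by_team changes comp =
  let add acc c =
    if c.component <> comp then acc
    else
      let seen = try List.assoc c.team acc with Not_found -> SS.empty in
      (c.team, SS.add c.author seen) :: List.remove_assoc c.team acc
  in
  List.fold_left add [] changes
  |> List.map (fun (team, authors) -> (team, SS.cardinal authors))
  |> List.sort compare

(* Figure 2 quantity: each team's share of all contributions to [comp]. *)
let share_by_team changes comp =
  let relevant = List.filter (fun c -> c.component = comp) changes in
  let total = List.fold_left (fun n c -> n + c.lines) 0 relevant in
  let add acc c =
    let so_far = try List.assoc c.team acc with Not_found -> 0 in
    (c.team, so_far + c.lines) :: List.remove_assoc c.team acc
  in
  List.fold_left add [] relevant
  |> List.map (fun (team, n) -> (team, float_of_int n /. float_of_int total))
  |> List.sort compare

(* Made-up sample history, for illustration only. *)
let sample = [
  { component = "xapi"; author = "alice"; team = "MTT"; lines = 60 };
  { component = "xapi"; author = "bob"; team = "MTT"; lines = 20 };
  { component = "xapi"; author = "alice"; team = "MTT"; lines = 10 };
  { component = "xapi"; author = "carol"; team = "Storage"; lines = 10 };
  { component = "storage"; author = "carol"; team = "Storage"; lines = 50 };
]

let () =
  List.iter (fun (team, n) -> Printf.printf "%s: %d contributor(s)\n" team n)
    (contributors_by_team sample "xapi");
  List.iter (fun (team, s) -> Printf.printf "%s: %.0f%%\n" team (100. *. s))
    (share_by_team sample "xapi")
```

On the made-up sample, the xapi component has two unique MTT contributors and one Storage contributor, and the MTT accounts for 90% of the changed lines. A different size measure (for example, number of changesets) would only require changing how the `lines` field is populated.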
References

[1] P. Barham, B. Dragovic, K. Fraser, S. Hand, T. Harris, A. Ho, R. Neugebauer, I. Pratt, and A. Warfield. Xen and the art of virtualization. In Proceedings of the 19th ACM Symposium on Operating Systems Principles (SOSP), pages 164–177, New York, NY, USA, 2003. ACM Press.
[2] B. Canou, V. Balat, and E. Chailloux. O’Browser: Objective Caml on browsers. In Proceedings of the 2008 ACM SIGPLAN workshop on ML, pages 69–78, New York, NY, USA, 2008. ACM.
[3] C. Clark, K. Fraser, S. Hand, J. G. Hansen, E. Jul, C. Limpach, I. Pratt, and A. Warfield. Live migration of virtual machines. In Proceedings of the 2nd Symposium of Networked Systems Design and Implementation, May 2005.
[4] P. Cuoq, J. Signoles, P. Baudin, R. Bonichon, G. Canet, L. Correnson, B. Monate, V. Prevosto, and A. Puccetti. Experience report: OCaml for an industrial-strength static analysis framework. In ICFP ’09: Proceedings of the 14th ACM SIGPLAN international conference on Functional programming, pages 281–286, New York, NY, USA, 2009. ACM.
[5] M. DeBergalis, P. Corbett, S. Kleiman, A. Lent, D. Noveck, T. Talpey, and M. Wittle. The Direct Access File System. In Proceedings of the 2nd USENIX Conference on File and Storage Technologies, pages 175–188, Berkeley, CA, USA, 2003. USENIX Association.
[6] J. Donham. OCamlJS, July 2010. http://jaked.github.com/ocamljs.
[7] T. Gazagnaire and V. Hanquez. Oxenstored: an efficient hierarchical and transactional database using functional programming with reference cell comparisons. In ICFP ’09: Proceedings of the 14th ACM SIGPLAN international conference on Functional programming, pages 203–214, New York, NY, USA, 2009. ACM.
[8] T. Gazagnaire and A. Madhavapeddy. Statically-typed value persistence for ML. In Proceedings of the Workshop on Generative Technologies, March 2010.
[9] F. Le Fessant and S. Patarin. MLdonkey, a Multi-Network Peer-to-Peer File-Sharing Program. Research Report RR-4797, INRIA, 2003.
[10] A. Madhavapeddy. Creating high-performance, statically type-safe network applications. Technical Report UCAM-CL-TR-775, University of Cambridge, Computer Laboratory, Apr. 2006.
[11] A. Madhavapeddy, A. Ho, T. Deegan, D. Scott, and R. Sohan. Melange: creating a “functional” Internet. SIGOPS Oper. Syst. Rev., 41(3):101–114, 2007.
[12] Y. Minsky and S. Weeks. Caml trading – experiences with functional programming on Wall Street. J. Funct. Program., 18(4):553–564, 2008.
[13] T. Morgan. Citrix desktop virt soars in Q4, Jan. 2010. http://bit.ly/ciB74a.
[14] B. O’Sullivan. Mercurial: the definitive guide. O’Reilly Media, first edition, 2009.
[15] D. Syme, A. Granicz, and A. Cisternino. Expert F#.
[16] J. Yallop. Practical generic programming in OCaml. In Proceedings of the 2007 workshop on Workshop on ML, pages 83–94, New York, NY, USA, 2007. ACM.