TENOR 2015 Proceedings
First International Conference on Technologies for Music Notation and Representation
TENOR 2015
28-30 May 2015
Université Paris-Sorbonne / Ircam, Paris

Proceedings published by:
Institut de Recherche en Musicologie (IReMus)
2, rue de Louvois, 75002 Paris
ISBN: 978-2-9552905-0-7
EAN: 9782955290507
Editors:
Marc Battier
Jean Bresson
Pierre Couprie
Cécile Davy-Rigaux
Dominique Fober
Yann Geslin
Hugues Genevois
François Picard
Alice Tacaille
Credits:
Nicolas Taffin (Logo design)
Nicolas Viel (Layout design)
All rights reserved. No part of this publication may be reproduced in any form or by any means
without the permission of the publishers or the authors concerned.
Music notation serves the needs of representation, writing and creation. New musical forms such as electronic and/or interactive music and live coding, as well as the migration of musical instruments to gestural and mobile platforms and hybridizations with dance, design and multimedia, tend to extend the notion of score in contemporary music, revisiting it through new forms of writing and spreading it over different media. Until recently, the support provided by computer music to the field of symbolic notation remained fairly conventional. However, recent developments indicate that the tools for musical notation are today ready to move forward towards new forms of representation.

Musical notation, transcription, sonic visualization, and musical representation are often associated in the fields of musical analysis, ethnomusicology, and acoustics. The aim of this conference is to explore these recent mutations of notation and representation in all these musical domains. The first International Conference on Technologies for Music Notation and Representation is dedicated to theoretical and applied research and development in music notation and representation, with a strong focus on computer tools and applications, as well as a tight connection to musical creation.

The scholarly conference, posters and demos take place at Paris-Sorbonne University and Ircam.
Organizing committee
Student volunteers
Fabiano Araujo
Elsa Filipe
Martin Guerpin
Marina Maluli
Daniel Rezende
Thomas Saboga
Steering committee
The Steering Committee is responsible for guiding future directions with regard to the TENOR conference.
Its members currently include:
Jean Bresson, Ircam-CNRS UMR STMS
Pierre Couprie, IReMus Université Paris-Sorbonne
Dominique Fober, GRAME
Yann Geslin, INA-GRM
Richard Hoadley, Anglia Ruskin University
Mike Solomon, Ensemble 101
Scientific committee
Carlos Agon
Andrea Agostini
Gerard Assayag
Karim Barkati
Marc Battier
Sandeep Bhagwati
Andrew Blackburn
Alan Blackwell
Alain Bonardi
Bruno Bossis
Jean Bresson
Elaine Chew
Michael Clarke
Pierre Couprie
Cécile Davy-Rigaux
Frédéric Dufeu
Simon Emmerson
Dominique Fober
Ichiro Fujinaga
Jérémie Garcia
Hugues Genevois
Yann Geslin
Daniele Ghisi
Jean-Louis Giavitto
Gérald Guillot
Georg Hajdu
Keith Hamel
Richard Hoadley
Florent Jacquemard
Guillaume Jacquemin
Mika Kuuskankare
Thor Magnusson
Mikhail Malt
Peter Manning
Tom Mays
Alex McLean
Yann Orlarey
François Pachet
François Picard
Philippe Rigaux
Eleanor Selfridge-Field
Mike Solomon
Marco Stroppa
Matthew Thibeault
Anders Vinjar
Keynote
Eleanor Selfridge-Field
Contents
Accretion: Flexible, Networked Animated Music Notation for Orchestra with the Raspberry Pi
K. Michael Fox
THEMA: A Music Notation Software Package with Integrated and Automatic Data Collection
Peter McCulloch
SVG to OSC Transcoding as a Platform for Notational Praxis and Electronic Performance
Rama Gottfried
LEADSHEETJS: A JAVASCRIPT LIBRARY FOR ONLINE LEAD SHEET EDITING

ABSTRACT

Lead sheets are music scores consisting of a melody and a chord grid, routinely used in many genres of popular music. With the increase of online and portable music applications, the need for easily embeddable, adaptable and extensible lead sheet editing tools is pressing. We introduce LeadsheetJS, a Javascript library for visualizing, editing and rendering lead sheets on multiple devices. LeadsheetJS provides lead sheet editing as well as support for extensions such as score augmentation and peer feedback. LeadsheetJS is a client-based component that can be embedded in arbitrary third-party websites. We describe the main design aspects of LeadsheetJS and some applications in online computer-aided composition tools.

Copyright: © 2015 Daniel Martin et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License 3.0 Unported, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

INTRODUCTION

A lead sheet is a specific type of music score consisting of a monophonic melody with associated chord labels (see Figure 1). Lead sheets are routinely used in many styles of popular music such as songwriting, jazz, pop or bossa nova.

With the rise of online music communities using performance or pedagogical applications, there is an increasing need for tools for manipulating music scores. In this context, music notation takes on an important role, in particular lead sheets, which are the main form of score for popular music. There is also a need for web-based tools for visualizing, playing, and editing lead sheets collaboratively. Such tools should also work on various devices, following the trend of using web applications on mobiles and tablets. Finally, these tools should intercommunicate easily with other tools, e.g. by being embeddable in third-party websites.

The most popular score editors, Finale and Sibelius, are designed as desktop applications. As such they cannot be used online, even though cloud features can be added, e.g. to share scores by exporting them to the web [9]. The open-source desktop editor MuseScore (http://musescore.org/) provides features for sharing scores but does not directly provide online editing. There are many online tools to edit and view scores, but they do not rely on web standards and often require the installation of a browser plugin. Some tools, such as NoteFlight (http://www.noteflight.com), Scorio (http://www.scorio.com/) or Flat.io (https://flat.io/), do follow standards and produce machine-readable scores, but they are not designed specifically for lead sheets. For instance, they do not support chord notations, an important feature of a lead sheet.

Besides offering basic score editing services, online lead sheet tools should provide features for augmented editing, e.g. to be tailored to pedagogical or social contexts. The ability to add heterogeneous graphic objects such as colored layers, text or images is crucial to enable collaboration between users as a way of giving feedback on certain parts of the score. INScore [4] supports various graphical objects, but it is not easily embeddable in an online application and is more focused on the real-time rendering of interactive music scores [6] for new forms of composition and performance.

This paper presents LeadsheetJS, a Javascript library for storing, visualizing, playing, editing and making graphical annotations on lead sheets. In the following section we describe the main features of the library. Then we give some hints about its implementation. We finally describe tools built on top of this library.

LEADSHEETJS

LeadsheetJS is a Javascript library for lead sheets. It enables the editing and visualization of lead sheets in conventional formats, as well as rendering, playing and storing lead sheets in a database. Figure 2 shows how LeadsheetJS interfaces with the player, the editing menu and the rendered lead sheet.

LeadsheetJS provides tools for users to collaborate and give feedback to each other by highlighting certain parts of the lead sheet and commenting on them or suggesting modifications. LeadsheetJS is implemented in Javascript, the main programming language for web browsers. This makes LeadsheetJS web-friendly and easily embeddable in third-party sites, as well as adaptable to several devices.

In the next sections we describe the main features of LeadsheetJS and give a detailed explanation of the main design and implementation aspects.

Music composition, as well as music learning, is a domain in which feedback on pieces being composed plays a major role. Feedback is traditionally provided by a teacher. Nowadays, online learning websites provide tools for peer feedback with which learners can produce and review feedback made by peers.

The possibility of giving feedback on the audio representation of a piece of music has been addressed in previous work, e.g. [19, 20]. However, by commenting on pure audio, i.e. on a rendered waveform, users are limited to commenting on given time spans, whereas by commenting on a lead sheet, users can refer directly to the musical elements making up lead sheets, such as notes, chord labels, chord transitions, bars or structural elements (see Figure 3).
2 Embeddability

Arbitrary websites can render lead sheets by importing the LeadsheetJS library in their HTML source code. New lead sheets can be created or imported, then rendered and edited from the site. As an example, we show a page of the MusicCircle platform [19] displaying the lead sheet Blue Room by Rodgers & Hart (see Figure 4).

First, the LeadsheetJS library is imported in the HTML page. Then, the lead sheet of Blue Room is imported from a database (LSDB, described later) in our JSON lead sheet format through the LSDB API, which allows external sites to retrieve lead sheets. Finally, the JSON text is converted to a LeadsheetJS object and displayed in the page (see Figure 5).
3 Multi-device

Web applications are not accessed only from desktop computers but also from tablets and mobile phones: responsive web design has become essential when designing web applications. To that aim, LeadsheetJS automatically resizes scores depending on the width of the screen. This way it can be visualized on devices with different screen sizes such as tablets or mobile phones (see Figure 6).

Figure 6. LeadsheetJS on a 1024x768 tablet.

4 Audio wave visualization

LeadsheetJS does not handle only symbolic information. Recordings of performances of a lead sheet can also be associated with it. LeadsheetJS provides visualization of the recording's waveform synchronized with the lead sheet, so that on top of each measure, the waveform of the corresponding part of the recording is displayed (see Figure 7). This feature is useful for musicians who record themselves performing a given lead sheet: they can then listen to their performance and see the lead sheet and the audio representation at the same time.

Figure 7. LeadsheetJS visualizing Solar, by Miles Davis, with the audio recording displayed.

5 Design

LeadsheetJS is a complex library that provides many functionalities (editing, visualizing, playing, storing). From an architectural point of view, it needs to be maintainable, scalable and extensible. Furthermore, modularity is required, as users may need only certain features of LeadsheetJS. For example, a music blogger may want to visualize and play lead sheets in her blog without allowing editing or audio visualization.
The design of LeadsheetJS is module-based. It is inspired by Zakas' architecture [21], in which every module is an independent unit that does not need any other module to work. Zakas' architecture is based on the MVC (Model-View-Controller) pattern: every module has its own model, view and controller classes. Each module is composed of a set of classes, with one file per class. In total, LeadsheetJS contains about 150 classes.

LeadsheetJS is a client-based Javascript library, i.e. it runs in the browser. However, certain functionalities require communication with a server or a database, such as storing or retrieving lead sheets. Databases and servers are not part of LeadsheetJS, yet it provides modules to communicate with them.

The architecture scheme is shown in Figure 8. The central module is the Leadsheet Model; all other modules depend on it, since they need it in order to work. The Viewer, Player and Interactor modules provide visualization, playing and editing functionalities respectively. The Annotation module provides graphic annotation for peer feedback purposes. The Format exporter/importer modules convert to and from various formats so that the represented lead sheet can be sent to (or received from) other applications. The Ajax module facilitates communication with a server; it is therefore used by the modules that depend on a server: the Data Base module, which is in charge of storing the lead sheet to a database in a given format (e.g. a MongoDB database, http://mongodb.com/), and the analysis tool modules, which we describe in Section 3. The Ajax module is in charge of sending requests to the server: for example, in order to store a lead sheet in a database, the Data Base module sends the data to the server as an HTTP request through the Ajax module.

Figure 9. Example of a client-server database structure using LeadsheetJS.

The core module, the Leadsheet Model, represents a lead sheet. A lead sheet consists of a melody, which is in most cases monophonic, and a chord label grid representing the harmony. From a structural point of view, a lead sheet is a hierarchical structure composed of sections, which are composed of bars, which in turn are formed by a list of notes (a melody) and a list of chord labels. Each of these levels defines specific attributes: at the top level, the lead sheet has a composer, a title and a style, as well as musical attributes such as a global key and time signature. Section-related attributes are the section name, the number of bars, the number of repetitions and the number of endings. Bars may also have specific time or key signature changes, as well as structure labels like coda or segno. Finally, the lowest levels of the hierarchy are the notes and chord labels.
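To make this hierarchy concrete, a lead sheet in this JSON format might look roughly like the sketch below; the paper does not spell out the exact schema, so every field name here is an assumption:

    // Hypothetical fragment in the spirit of the JSON lead sheet format.
    var leadsheet = {
      title: "Blue Room",
      composer: "Richard Rodgers",
      style: "jazz",
      keySignature: "F",
      timeSignature: "4/4",
      sections: [{
        name: "A",
        repetitions: 1,
        endings: 0,
        bars: [{
          // Optional bar-level overrides and structure labels:
          // timeSignature, keySignature, label ("coda", "segno", ...)
          chords: [{ root: "F", type: "maj7", beat: 1 }],
          melody: [{ pitch: "A/4", duration: "q" },
                   { pitch: "C/5", duration: "q" }]
        }]
      }]
    };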
5.1 Viewer

The Viewer module renders the score graphics dynamically. It uses VexFlow (http://www.vexflow.com), a low-level score rendering Javascript library. VexFlow addresses the low-level rendering of notes and staves, whereas LeadsheetJS specifies what to draw in each bar, as well as other higher-level tasks such as determining how many bars to display per line.
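This division of labor can be illustrated with the following sketch. The Vex.Flow.Renderer and Vex.Flow.Stave calls are VexFlow's standard entry points; everything else, including the layout arithmetic and names, is our own illustration rather than LeadsheetJS source:

    // Sketch: the LeadsheetJS side decides how many bars fit per line,
    // then delegates stave drawing to VexFlow.
    function drawLeadsheet(bars, canvasElement, canvasWidth) {
      var renderer = new Vex.Flow.Renderer(canvasElement,
                                           Vex.Flow.Renderer.Backends.CANVAS);
      var ctx = renderer.getContext();
      var barWidth = 180;
      var barsPerLine = Math.floor(canvasWidth / barWidth); // higher-level layout decision
      bars.forEach(function (bar, i) {
        var x = (i % barsPerLine) * barWidth;
        var y = Math.floor(i / barsPerLine) * 120;
        new Vex.Flow.Stave(x, y, barWidth).setContext(ctx).draw();
        // The notes of `bar` would be rendered here with VexFlow StaveNote objects.
      });
    }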
5.2 Interactor

The Interactor component provides the editing part, using the jQuery library (http://jquery.com/), which, among many other things, takes care of event handling. Keyboard and mouse events are caught by the Interactor to perform the desired transformations on the edited lead sheet. We introduce three levels of editing: notes, chord labels and bars. Note editing works as in any traditional score editor. Chord label editing provides specific interaction schemes such as completion, to suggest the most relevant chord types in a given context (see Figure 10). LeadsheetJS contains a comprehensive database of over 300 chord types, collected during the compilation of the lead sheet database described in Section 3.1.

Figure 10. Chord label completion to speed up editing.
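Such a completion scheme can be sketched as a simple filter over the chord type database; the function below is an illustration, not the library's actual code:

    // Suggest chord types matching what the user has typed so far.
    var CHORD_TYPES = ['maj7', 'm7', '7', '6', 'm6', 'dim7', 'm7b5' /* ... ~300 entries */];

    function suggestChordTypes(typedText, maxSuggestions) {
      var prefix = typedText.toLowerCase();
      return CHORD_TYPES
        .filter(function (type) { return type.indexOf(prefix) === 0; })
        .slice(0, maxSuggestions);
    }

    // suggestChordTypes('ma', 5) -> ['maj7']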
5.3 Player

LeadsheetJS provides a MIDI player which uses the MIDI.js library (http://mudcu.be/midi-js/) to play a lead sheet, i.e. both the melody and the chord labels. The chord labels are transformed into MIDI chords.

A chord label is represented by a pitch and a chord type; e.g. in C# maj7, C# is the pitch and maj7 the chord type. The chord type database provides information about the note degrees of each chord type: for instance, for maj7 the degrees are I, III, V and VII. In order to play chords, LeadsheetJS transforms chord labels into sets of MIDI notes by calculating the notes of the chord type relative to the root pitch. E.g. for C# maj7 the notes are C#-E#-G#-B#. The player plays them arbitrarily in the 4th octave, so the MIDI notes are 61-65-68-72. Other, more refined MIDI players can easily be defined by the user.
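This mapping can be sketched as follows; the interval table and function name are illustrative assumptions rather than the library's internals:

    // Transform a chord label into MIDI notes, relative to the root pitch.
    var ROOT_MIDI = { 'C': 60, 'C#': 61, 'D': 62, 'D#': 63, 'E': 64, 'F': 65,
                      'F#': 66, 'G': 67, 'G#': 68, 'A': 69, 'A#': 70, 'B': 71 };
    // Semitone offsets of each degree of the chord type (I-III-V-VII for maj7).
    var CHORD_INTERVALS = { 'maj7': [0, 4, 7, 11], 'm7': [0, 3, 7, 10], '7': [0, 4, 7, 10] };

    function chordToMidi(root, type) {
      var rootMidi = ROOT_MIDI[root];  // 4th octave by default
      return CHORD_INTERVALS[type].map(function (interval) {
        return rootMidi + interval;
      });
    }

    // chordToMidi('C#', 'maj7') -> [61, 65, 68, 72], i.e. C#-E#-G#-B#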
5.4 Javascript Module Management

As a client-based application, LeadsheetJS runs in the browser, so each Javascript file needs to be imported in the HTML source code through a script tag. This may be an issue, as each of the roughly 150 class files would need to be included explicitly, while not all classes are always needed. For example, an instance of LeadsheetJS could only show a lead sheet and play it: in that case there is no need for editing, so the Interactor module does not need to be loaded. To optimize loading time and ensure that only the needed modules are loaded, LeadsheetJS uses RequireJS (http://requirejs.org/), a tool to manage dependencies in Javascript.
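With RequireJS, each class declares its dependencies and is fetched only when required. A minimal sketch, with assumed module paths:

    // Definition of a module that depends on the central Leadsheet Model.
    define(['modules/LeadsheetModel'], function (LeadsheetModel) {
      function Viewer(json) { this.model = new LeadsheetModel(json); }
      Viewer.prototype.draw = function () { /* render this.model */ };
      return Viewer;
    });

    // A page that only displays and plays lead sheets loads just what it needs;
    // the Interactor (editing) module is never fetched.
    require(['modules/Viewer', 'modules/Player'], function (Viewer, Player) {
      // ... instantiate the viewer and player here.
    });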
In order to provide communication between modules in an uncoupled way, we make intensive use of the Mediator design pattern [12]. The Mediator pattern encapsulates the way different modules interact: it enables a module to subscribe to an action that another module publishes.

For example, when the Leadsheet Model module changes the pitch of a note, it publishes that action; that is, it sends a message to a mediator telling it that the note's pitch has changed. The mediator checks which modules are interested in the note-pitch-changed action, that is, which modules are subscribed, and informs them. This way the Viewer module, which is subscribed to note-pitch-changed, knows it must redraw the score.

The advantage of using this pattern is that the Leadsheet Model and Viewer do not communicate directly, which leads to uncoupled, and thus more scalable and maintainable, code.
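A minimal publish/subscribe mediator of this kind could look as follows; the event name and API are assumptions, since the paper does not give them:

    // Minimal Mediator: modules communicate only through publish/subscribe.
    var mediator = {
      channels: {},
      subscribe: function (channel, callback) {
        (this.channels[channel] = this.channels[channel] || []).push(callback);
      },
      publish: function (channel, data) {
        (this.channels[channel] || []).forEach(function (cb) { cb(data); });
      }
    };

    // The Viewer subscribes to pitch changes ...
    mediator.subscribe('notePitchChanged', function (note) {
      // redraw the score
    });
    // ... and the Leadsheet Model publishes them, without knowing the Viewer:
    mediator.publish('notePitchChanged', { noteIndex: 3, pitch: 'E/4' });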
5.5 Javascript implementation

Javascript is a prototype-based language rather than a class-based one like C++ or Java. In order to define classes, there are mainly two approaches: object literals or prototypes. By using object literals to define classes, one can have private variables through the Module pattern [12], which takes advantage of closures to simulate private variables, which are not natively supported in Javascript. Using prototypes to define classes, on the other hand, one cannot emulate private variables, but this approach has the advantage of consuming less memory, since all instances of a class share the same methods in memory. We have mainly used the prototype approach, as we use multiple instances of many classes such as NoteModel or ChordModel.
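The trade-off between the two approaches can be illustrated as follows (a sketch; NoteModel's real fields are not described in the paper):

    // Module pattern: closures give privacy, but each instance gets its own
    // copies of the methods.
    function createNote(pitch) {
      var _pitch = pitch;  // private via closure
      return {
        getPitch: function () { return _pitch; },
        setPitch: function (p) { _pitch = p; }
      };
    }

    // Prototype approach (used in LeadsheetJS): no private variables,
    // but all instances share one copy of each method in memory.
    function NoteModel(pitch) { this.pitch = pitch; }
    NoteModel.prototype.getPitch = function () { return this.pitch; };
    NoteModel.prototype.setPitch = function (p) { this.pitch = p; };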
The Harmonizer tool, given a monophonic melody, proposes a multi-voice harmonization in a given style. E.g. one can harmonize the melody of Coltrane's jazz standard Giant Steps in the style of Wagner or Bill Evans [14].

2 Automatic Feedback on lead sheets

Figure 13. A chord sequence analyzer grafted on top of LeadsheetJS.

The analysis result is returned to LeadsheetJS, which presents it in the user interface as a time-line map.
3 Flow Composer

In the context of the Flow Machines project (http://www.flow-machines.com) on style imitation, an online composition tool called Flow Composer was designed to help a composer generate a lead sheet using different "styles". Again, styles are defined by corpora of songs taken from the lead sheet database.

The main idea is that a composer can start to create a song and leave some empty measures containing only silences. Then, he queries the system to fill those blanks in a given style. The blanks can be in the melody, represented by silences, or in the chord grid, represented by No Chords (NC). The system generates a melody or chord labels to fill them, taking into account the style chosen by the user as well as continuity constraints. Composers usually don't want a whole new random song; rather, they want the system to help them with certain parts of their composition. The composer can accept or reject all or part of the system's proposition. Flow Composer allows composers to keep full control of the lead sheet at any moment: there is a history feature in which every step is saved, so they can go back to a previous state.

Flow Composer is built on top of LeadsheetJS and uses the same modular approach. LeadsheetJS is used in Flow Composer to listen to, view and edit lead sheets. Figure 16 shows how Flow Composer works. In the first image (top), a user is composing a bossa nova. The song has two parts; the second part starts at measure 7 (with note F and chord F7) and is not shown in the figure. The second part is fine, but the composer does not know how to finish the first part so that it transitions well to the second. So he leaves it empty, with silences and no chords (NC), and queries Flow Composer to fill the empty part in the bossa nova style. The second image (bottom) shows the result proposed by Flow Composer: it has filled the empty part with a proposed melody and chord grid. Interaction may then proceed by accepting parts of the suggestions and/or querying other solutions.

Figure 16. Flow Composer completion in blue.

4 Experiment on feedback in composition

PRAISE (Practice and Performance Analysis Inspiring Social Education, http://www.iiia.csic.es/praise/) is a social network for music education with tools for giving and receiving feedback in online communities. In the context of PRAISE we have built a tool for feedback in composition, with which composers can compose a lead sheet and share it with other composers who can then provide feedback. This tool is based on the annotation module of LeadsheetJS.

In the PRAISE project, we designed an experiment to determine the impact of feedback on lead sheet composition [10]. We evaluate whether musical peer feedback, as in the example explained in Section 2.1, actually improves the musical quality of a composition. In a first phase, participants are asked to compose a short song (8 bars). In a second phase, they are invited to suggest modifications to other participants' compositions. Participants are then asked to reconsider their original song and try to improve it. The point is that one group of subjects will have received feedback whereas another group will not. We then evaluate to what extent the improved compositions of the subjects who received feedback are better than those of the subjects who did not. The quality is estimated by a listening panel. LeadsheetJS was used to implement this experiment, including the editing and playing modules for the composition phase and the Annotation module for the feedback phase.

The composer of the lead sheet can later review suggestions and accept them or not. The feedback process is illustrated as follows. First, user Bruno composes a song and edits it with LeadsheetJS.
Later, user Silvia looks at it and plays it. She decides to make some suggestions on certain notes. As shown in Figure 17, once she has saved a suggestion she can perform other actions, shown in the contextual menu:

- Add Comment: add an explanation of her musical suggestion;
- Upload sound: upload a sound recording related to the suggestion;
- Modify: modify the suggestion she just saved;
- Remove: remove the suggestion.

Figure 17. A user makes a suggestion on a specific part of a lead sheet.

Later on, Bruno can review all the suggestions by switching between the original elements and the suggested ones and listening to them. Figure 18 shows a lead sheet with three suggestions. Bruno clicks on one of them to see the associated explanation.

Figure 18. A user checks the suggestions received.

Finally, if Bruno likes a suggestion he can accept it, so that the suggestion is merged into the whole song, by right-clicking on the suggestion (see Figure 19).

CONCLUSION

We have presented LeadsheetJS, a Javascript library for lead sheets. By design, LeadsheetJS is compatible with multiple devices and easily embeddable. LeadsheetJS also provides various tools for music composition, such as automatic analysis and peer feedback. We have illustrated how LeadsheetJS is used in several online music applications.

LeadsheetJS addresses the needs of online applications for composing, generating, sharing or teaching music. New features are currently being investigated, such as multiple-voice management, lyrics, an audio-based player, and rendering lead sheets using style-based accompaniment generation systems.

Acknowledgments

This work is supported by the PRAISE project (EU FP7 number 388770), funded by the European Commission under program FP7-ICT-2011-8.

REFERENCES

[1] D. Crockford, "The JSON data interchange format". Technical report, ECMA International, October 2013.

[2] C. Daudin, D. Fober, S. Letz, Y. Orlarey, "The Guido engine: a toolbox for music scores rendering". In Proceedings of the Linux Audio Conference, 2009, pp. 105-111.

[3] G. Dyke, P. Rosen, abcjs - Project Hosting on Google Code, 2010.

[4] D. Fober, S. Letz, Y. Orlarey, F. Bevilacqua, "Programming Interactive Music Scores with INScore". In Proceedings of the Sound and Music Computing Conference, July 2013, pp. 185-190.

[5] D. Fober, Y. Orlarey, S. Letz, "Scores level composition based on the Guido Music Notation". Ann Arbor, MI: MPublishing, University of Michigan Library, 2012.

[6] D. Fober, Y. Orlarey, S. Letz, "Augmented Interactive Scores for Music Creation". In Korean Electro-Acoustic Music Society's 2014 Annual Conference, October 2014, pp. 85-91.
[7] M. Good, "MusicXML: An internet-friendly format for sheet music". In XML Conference and Expo, 2001, pp. 3-4.

[8] T. Hedges, P. Roy, F. Pachet, "Predicting the Composer and Style of Jazz Chord Progressions". Journal of New Music Research, 43(3), 2014, pp. 276-290.

[9] J. Kuzmich, "The two titans of music notation". School Band & Orchestra magazine, September 2008.

[10] D. Martín, B. Frantz, F. Pachet, "Assessing the impact of feedback in the composition process: an experiment in lead sheet composition". In Tracking the Creative Process in Music, Paris, October 2015.

[11] H. W. Nienhuys, J. Nieuwenhuizen, "LilyPond, a system for automated music engraving". In Proceedings of the XIV Colloquium on Musical Informatics (XIV CIM 2003), 2003, pp. 167-172.

[12] A. Osmani, Learning JavaScript Design Patterns. O'Reilly Media, Inc., 2012.

[13] F. Pachet, "Surprising harmonies". International Journal of Computing Anticipatory Systems 4, 1999.

[14] F. Pachet, P. Roy, "Non-conformant harmonization: the Real Book in the style of Take 6". In International Conference on Computational Creativity, Ljubljana, 2014.

[15] F. Pachet, J. Suzda, D. Martín, "A Comprehensive Online Database of Machine-Readable Lead-Sheets for Jazz Standards". In ISMIR, 2013, pp. 275-280.

[16] P. Roland, "The Music Encoding Initiative (MEI)". In Proceedings of the First International Conference on Musical Applications Using XML, Vol. 1060, 2002, pp. 55-59.

[17] I. Stravinsky, R. Craft, Dialogues. London: Faber and Faber, 1982.

[18] M. Solomon, D. Fober, Y. Orlarey, S. Letz, "Providing Music Notation Services over Internet". In Proceedings of the Linux Audio Conference, 2014.

[19] M. Yee-King, M. d'Inverno, P. Noriega, "Social machines for education driven by feedback agents". In Proceedings of the First International Workshop on the Multiagent Foundations of Social Computing, AAMAS-2014, Paris, 2014.

[20] M. Yee-King, M. d'Inverno, "Pedagogical agents for social music learning in Crowd-based Socio-Cognitive Systems". In Proceedings of the First International Workshop on the Multiagent Foundations of Social Computing, AAMAS-2014, Paris, 2014.

[21] N. Zakas, "Scalable Javascript Application Architecture". Slides: http://cern.ch/go/Cl6S.
   BIGRAM EDITOR: A SCORE EDITOR FOR THE BIGRAM NOTATION
ABSTRACT
• Binary Keyboards: layout-modified keyboards with a high resemblance to the Bigram Notation

2. BIGRAM NOTATION

2.1 Notation vs. Tablature

The traditional keyboard layout and the conventional notation system share the inner structure of white keys as non-accidental notes (and, of course, fully spelled note names); therefore, conventional notation might be considered a special interpretation of keyboard tablature.

Parncutt [1] introduces the idea that, for beginners, tablature notation might be the most appropriate, due to its easiness. However, experienced performers might prefer conventional notation, for its resemblance to our bidimensional perception of pitch and time.

This fact gives us the opportunity to explore a new approach to musical notation. What if we could design a notation that clearly resembles the pitch-time graph, but at the same time is an explicit representation of the finger positions on the keyboard? Such a system would be, according to Parncutt, convenient for both beginner and expert musicians, and would provide a faster learning process.

In order to reach that goal, a convenient keyboard layout should be designed. This keyboard is discussed in Section 3.

2.2 Bigram

As a consequence of the previous idea, we developed the Bigram Notation. It takes its name from the fact that, in the staff, each octave presents only two equidistant lines, separated by a tritone. Consequently, we preserve the octave periodicity and minimize the cognitive overhead of counting lines to identify a note (both desired criteria from [5]).

Figures 2 and 3 show the A Major scale and the chromatic scale, respectively, written in bigram notation. Figure 4 shows the same excerpt as Figure 1 in bigram notation.

Figure 2. Bigram notation: the A Major scale.

Figure 3. Bigram notation: the chromatic scale starting on A.

2.2.1 Pitch representation

One of the most distinctive characteristics of the bigram notation is the representation of pitch by black and white noteheads. The A note was (arbitrarily) chosen to be represented on the first line, and to be black. When ascending the chromatic scale, each new note has a different color, alternating white and black noteheads (as in Figure 3). This approach makes the intervals color-consistent, making the inner structure of melodies and harmonies very explicit and emphasizing intervallic reading [7]. In addition, it reduces the number of required staff lines, facilitating note identification and minimizing cognitive overhead.

Notice that, in Figure 2, the semitone structure of the Major scale becomes self-evident. Furthermore, the Liszt excerpt (Figure 4) clearly reveals its structure: symmetric parallel chromatic movements, maintaining the voices' intervallic relationships.

The bigram pitch structure itself can therefore be seen as a combination of 6-6 black & white notehead systems (such as the Isomorph Notation by Tadeusz Wójcik or 6-6 Klavar by Cornelis Pot) with systems whose staff lines are separated by a tritone (the MUTO Notation by the MUTO Foundation or Express Stave by John Keller, 2005) [3].

2.2.2 Rhythm representation

Regarding rhythmic notation, we opted for a representation that preserves time-distance proportionality, as suggested in the MNP criteria [5]. As in conventional notation, time is divided into bars. Each bar has a number of pulses, which in turn have a number of divisions. Bars, pulses and divisions are represented by vertical lines whose width is proportional to their position in the time hierarchy. As an example, the scale in Figure 2 occupies one whole bar, with four pulses and two divisions per pulse; the notes are placed on each of the 8 bar divisions.
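As an illustration of this time hierarchy, the grid of vertical lines for the bar of Figure 2 could be computed as in the following sketch (assumed widths and units, not the editor's actual rendering code):

    // Vertical grid for one bar with 4 pulses and 2 divisions per pulse.
    // Line width encodes the level in the time hierarchy (assumed pixel widths).
    function barGrid(pulses, divisionsPerPulse, barWidthPx) {
      var slots = pulses * divisionsPerPulse;
      var step = barWidthPx / slots;
      var lines = [];
      for (var i = 0; i <= slots; i++) {
        var level = (i % slots === 0) ? 'bar'
                  : (i % divisionsPerPulse === 0) ? 'pulse' : 'division';
        lines.push({ x: i * step, width: { bar: 3, pulse: 2, division: 1 }[level] });
      }
      return lines;
    }

    // barGrid(4, 2, 320) yields 9 lines delimiting the 8 divisions of Figure 2's bar.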
Figure 4. Bigram notation example from Franz Liszt's "Hungarian Rhapsody No. 2".

The bigram system fulfills each of the seventeen design criteria for notation design established by the MNP. We must highlight that, although its development is subject to continuous evaluation, the potential changes that might occur will not radically change the basic ideas exposed here.

Regarding further extensions of the concept, the authors are investigating a compact and adequate way of representing harmony within the bigram context. Due to the authors' interest in jazz, this research is focused on the most common 4-note chords and their variations.

3. MODIFIED INSTRUMENTS

3.1 Binary Keyboards

As already mentioned in Section 2.1, one of the strengths of the bigram notation is that it relies on the existence of keyboards with a high resemblance to the written notation. With such instruments, it would even be possible to play a bigram notation score without knowing which notes are being represented (even though this practice is not recommended).

The authors are investigating the prototyping and fabrication of such keyboards, which are referred to as binary keyboards. Figures 5 and 6 show two current working prototypes: a MIDI controller and a melodica, respectively. We believe that, even though the binary keyboard layout differs completely from the standard layout, conventional piano playing techniques might be applied to binary keyboards, since both layouts share the two-row key disposition.

Figure 5. Binary MIDI keyboard prototype.

Figure 6. Binary melodica prototype.

The A notes are presented on the keyboards with a different color. This mimics the bigram notation, in which the A notes are situated on the main staff line and therefore used as a reference.

The authors are currently investigating the appropriateness of introducing tactile feedback cues, such as using a different material or introducing marks. The tritone note (D#), which occupies the central line in the staff, might also receive a distinction. Such tactile feedback cues might be helpful both for visually impaired people and for experienced players, who might need to know their hand position without looking at the keyboard (as experienced conventional piano players usually do, using the cues of the black keys' absence).

From a first look at the binary keyboard layout, one of its main benefits becomes apparent. Since it is isomorphic, there exist only two different positions for playing any passage: starting on a white key, or starting on a black key. This contrasts sharply with the 12 potentially different positions in conventional layouts.

3.2 Similar approaches

The presented binary keyboard layout is not a new concept; the first references to the idea appeared in 1859. In his book [8], K. B. Schumann presented his binary keyboard proposal in a chapter called "Das natürliche System" ("The natural system"). He also described there an alternative notation system based on a chromatic approach. In the same year, A. Gould and C. Marsh patented the binary keyboard in the USA [9], under the name "Keyboard for Pianos".

Bart Willemse gathers on his website [10] some other historic binary keyboard proposals, which he calls "Balanced Keyboards".
toric binary keyboard proposals, which he calls ”Balanced            are shown as potentially compatible with alternative nota-
Keyboards”.                                                          tions: Finale and LilyPond.
  Another relevant approach can be found in 1882 in the                Finale [17] is a well known score editor. The MNP ex-
Janko keyboard [11], which featured several rows of iso-             plains the method created by John Keller to convert be-
morphic keys. Among others, it did not succeeded com-                tween notation systems [16], by using staff templates. There-
mercially because of the lack of written material, due to            fore, it would be possible to create a bigram template, which
the reticence of publishers (motivated in turn by the musi-          might have a very low developing cost, and use it for our
cians’ reticence) [11, 12].                                          purpose.
  The Chromatone [13] is a modern, digital revision of the             However, in our opinion, Finale has some drawbacks.
Janko keyboard.                                                      The most important of them is that it is proprietary soft-
  The Tri-Chromatic Keyboard Layout [14] is a layout de-             ware. We believe that a project such as the Bigram Edi-
signed by R. Pertchik, and implemented in his vibraphone.            tor, constantly evolving and with a high educational value,
The layout is identical to the binary keyboard, excepting            should be freely available and customizable - in other words,
for the colors. Three different alternate colors are present,        free software. Finale’s platform dependency is also a dis-
highlighting the minor third intervals (and, consequently,           advantage. Furthermore, its price ($600, $350 for students)
the three diminished chords).                                        makes it potentially prohibitive.
  We must also mention the Dodeka approach [15]. As in                 The other proposed alternative is LilyPond [18]. It is
our research, Dodeka presents a notation system together             an original, WYSIWYM approach to score edition. Lily-
with a modified keyboard. The notation system follows a              pond is highly flexible, and thus it is possible to define the
regular pitch-space configuration, with 3 lines per octave.          score’s appearance, allowing the usage of alternative nota-
The keyboard is a representation of the notation system,             tions. In addition, it is a muliplatform, free software editor.
with colour references each major third. However, all keys             Nevertheless, the text-based approach to score edition of
are placed in a single row, which might complicate playa-            Lilypond might represent a big usability problem for those
bility and standard keyboard techniques adoption.                    not used to code or WYSIWYM interfaces. The Bigram
                                                                     Editor should encourage users to create music as soon as
                                                                     possible, minimizing the time spent on learning how to use
3.3 Conventional instruments

Despite the close resemblance of bigram notation to the binary keyboard, the notation is potentially suitable for all kinds of conventional instruments. Isomorphic instruments, such as orchestral strings, might at first appear to be the most accessible instruments for bigram notation, due to their intrinsic representation of pitch and intervals. However, any other instrument is potentially capable of performing bigram scores, provided that the relationship between notation and instrument notes is known; a sketch of such a mapping is given below.
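As an illustration of such a relationship, the following minimal Python sketch maps chromatic pitch positions (here encoded as MIDI note numbers, which is our assumption for illustration; the paper does not prescribe this encoding) to string/fret pairs on a guitar in standard tuning. The function name and data layout are hypothetical.

  # Hypothetical illustration: map chromatic positions (MIDI numbers) to
  # guitar string/fret pairs, i.e. one possible "relationship between
  # notation and instrument notes" needed to perform bigram scores.

  OPEN_STRINGS = [40, 45, 50, 55, 59, 64]  # E2 A2 D3 G3 B3 E4 (standard tuning)

  def positions_for(midi_pitch, max_fret=19):
      """Return all (string, fret) pairs that sound the given MIDI pitch."""
      return [(s, midi_pitch - open_pitch)
              for s, open_pitch in enumerate(OPEN_STRINGS)
              if 0 <= midi_pitch - open_pitch <= max_fret]

  # Example: middle C (MIDI 60) can be played in several positions.
  print(positions_for(60))   # [(1, 15), (2, 10), (3, 5), (4, 1)]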
are shown as potentially compatible with alternative notations: Finale and LilyPond.
  Finale [17] is a well-known score editor. The MNP explains the method created by John Keller to convert between notation systems [16] by using staff templates. It would therefore be possible to create a bigram template, which might have a very low development cost, and use it for our purpose.
  However, in our opinion, Finale has some drawbacks. The most important of them is that it is proprietary software. We believe that a project such as the Bigram Editor, constantly evolving and with a high educational value, should be freely available and customizable: in other words, free software. Finale's platform dependency is also a disadvantage. Furthermore, its price ($600, or $350 for students) makes it potentially prohibitive.
  The other proposed alternative is LilyPond [18]. It is an original, WYSIWYM approach to score edition. LilyPond is highly flexible, and thus it is possible to define the score's appearance, allowing the usage of alternative notations. In addition, it is a multiplatform, free software editor.
  Nevertheless, LilyPond's text-based approach to score edition might represent a big usability problem for those not used to code or WYSIWYM interfaces. The Bigram Editor should encourage users to create music as soon as possible, minimizing the time spent on learning how to use the software.

4.1.2 Design considerations

Therefore, we opted for implementing our own custom Bigram Editor. Despite the increase in workload, the decision gave us the opportunity to fully adapt the software to our needs. The established design criteria were the following:

  • WYSIWYG paradigm metaphor for the creation and edition of scores, in order to facilitate its usage
Figure 7. Bigram Editor: arrangement view

4.2 Features

The main interaction window is called the Arrangement View (see Figure 7). It provides a general overview of the score in a multi-track sequencer style. From here, users can access all available functionalities.

4.2.1 Tracks and regions

The musical material is organized into tracks, or voices. Through the menus, the user can create, duplicate or delete tracks. For each track, the following controls are provided:

  • Track ID number
  • Record/solo/mute controls
  • MIDI instrument selector
  • Panning and volume controls
  Inside each track, users may place regions. A region is the structural element containing the notes. Three different tools are available for managing regions:

Pointer  Select a region and open the Edit View.
Rubber   Delete a region.

  Furthermore, it is possible to move, duplicate, merge and ungroup regions through mouse actions and/or the menus.
  The Edit View (Figure 8) provides access for editing the musical material. Users can insert, delete, duplicate or move notes using the Input (I) and Edit (E) controls. A binary keyboard reference is shown at the left margin of the score, along with the octave number.

Figure 8. Bigram Editor: edit view

4.2.2 Reproduction

The Arrangement Window provides play/stop and loop reproduction controls; these are managed by the reproduction bar and the loop bar (vertical red and blue lines in Figure 7, respectively).
  Sound is not synthesized by SuperCollider itself. Instead, the score is translated to MIDI and streamed in real time to a MIDI synthesizer, which is platform-dependent. Currently, the system uses FluidSynth [22] for Linux, and the default internal synthesizers for Windows and OS X. A sketch of this scheme is given below.
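As an illustration of this reproduction scheme, the following minimal Python sketch streams score notes as real-time MIDI messages using the mido library. The note representation (onset, duration, pitch) and the function name are our own assumptions, not the editor's actual SuperCollider implementation.

  # Minimal sketch (not the editor's actual code): translate score notes to
  # MIDI and stream them in real time to a platform-provided synthesizer.
  import time
  import mido

  def play(notes, port_name=None):
      """notes: list of (onset_sec, duration_sec, midi_pitch) tuples."""
      out = mido.open_output(port_name)  # default port, e.g. FluidSynth on Linux
      events = []                        # flatten notes into timed on/off events
      for onset, dur, pitch in notes:
          events.append((onset, mido.Message('note_on', note=pitch, velocity=80)))
          events.append((onset + dur, mido.Message('note_off', note=pitch)))
      events.sort(key=lambda e: e[0])
      start = time.monotonic()
      for t, msg in events:              # wait until each event is due, then send
          time.sleep(max(0.0, t - (time.monotonic() - start)))
          out.send(msg)

  play([(0.0, 0.5, 60), (0.5, 0.5, 64), (1.0, 1.0, 67)])  # C major arpeggio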
4.2.3 File management

The Bigram Editor provides file save and load functions. The score state is translated into a simple, custom description file based on XML. These files are generated automatically in the temporary folder every time a change in the score occurs; the undo/redo functions are built upon this mechanism. A sketch of what such a description file might look like is shown below.
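The paper does not specify the schema of these description files; the following fragment is purely hypothetical, sketching how tracks, regions and notes might be serialized in such an XML-based format.

  <!-- Hypothetical sketch of a score description file; the actual schema
       used by the Bigram Editor is not documented in the paper. -->
  <bigramScore tempo="120">
    <track id="1" instrument="0" pan="0.0" volume="0.8">
      <region start="0" length="16">
        <note onset="0" duration="2" pitch="60"/>
        <note onset="2" duration="2" pitch="64"/>
      </region>
    </track>
  </bigramScore>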
  Furthermore, it is possible to import multi-track MIDI files from the menu in the Arrangement Window.

5. CONCLUSIONS

In this paper, we presented the basis of the Bigram Notation and our holistic approach to this alternative notation, considering the notation theory itself, the modified keyboards, and the score editor.
  Several experiments might be run in order to assess the usability of the Bigram Editor in terms of Human-Computer Interaction. However, its usefulness is supported by the fact that it is currently the only available score editor for the bigram notation.
  The authors have received good preliminary qualitative impressions from individual users who have already started studying with the bigram system, using the software and the binary keyboards. Those impressions were especially remarkable in the case of people with little or no previous musical background or keyboard skills. We must remark that, due to the current limited availability of binary keyboards, these test experiences cannot yet be carried out on a regular basis.
  In the near future, an experimental case study is planned, in order to evaluate the learning curve and the acquisition of musical skills in beginners using the bigram notation. That experiment would be a variant of Parncutt's proposal [1], which has never been carried out. Such an experiment would consist of two groups of musically untrained subjects learning piano, one using conventional keyboard and notation, and the other using bigram notation and binary keyboards. The subjects' acquired musical knowledge (in terms still to be defined) would be evaluated over a sufficiently long period.
  Regarding the Bigram Editor, a number of improvements might be implemented. One of the most relevant features would be the possibility of editing and exporting the score in a graphical format. That feature would make it possible to obtain high-quality printable scores, for use away from the computer.
  Another potential improvement might be the adoption of the MusicXML markup language [23] for the description files. MusicXML is used by most score editors and Digital Audio Workstations; therefore, its adoption might considerably widen the range of compositions available for the bigram notation, as well as the possibilities for score exchange.
                                                                           PeopleAndResources.aspx
6. REFERENCES

[1] R. Parncutt, "Systematic evaluation of the psychological effectiveness of non-conventional notations and keyboard tablatures", in I. Zannos (Ed.), Music and Signs (pp. 146-174). Bratislava, Slovakia: ASCO Art & Science, 1999.

[2] The Music Notation Project, "Chromatic Staves Example" [online], http://musicnotation.org/tutorials/chromatic-staves-example/ (Accessed: January 2015).

[3] T. S. Reed, "Directory of music notation proposals". Notation Research Press, 1997.

[4] The Music Notation Project, "Introducing The Music Notation Project" [online], http://musicnotation.org/blog/2008/01/introducing-the-music-notation-project/ (Accessed: January 2015).

[5] The Music Notation Project, "Desirable Criteria for Alternative Music Notation Systems" [online], http://musicnotation.org/systems/criteria/ (Accessed: January 2015).

[6] R. Parncutt, "Psychological testing of alternative music notations" (Research project ideas for students of systematic musicology and music psychology) [online], 2014, http://www.uni-graz.at/~parncutt/fk5_projekte.html#Psychological_testing_of_alternative (Accessed: January 2015).

[7] The Music Notation Project, "Intervals in 6-6 Music Notation Systems" [online], http://musicnotation.org/tutorials/intervals-in-6-6-music-notation-systems/ (Accessed: January 2015).

[8] K. B. Schumann, "Vorschläge zu einer gründlichen Reform in der Musik durch Einführung eines höchst einfachen und naturgemässen Ton- und Noten-Systems, nebst Beschreibung einer nach diesem System construirten Tastatur für das Fortepiano", 1859. http://hdl.handle.net/1802/15314

[9] A. Gould, "Arrangement of keyboard for pianos" [patent], US Patent 24,021, 1859. http://www.google.com/patents/US24021 (Accessed: January 2015).

[10] B. Willemse, "People and resources relating to the Balanced keyboard" [online], 2013, http://balanced-keyboard.com/PeopleAndResources.aspx

[11] A. Dolge, "Pianos and Their Makers" [book], Covina/Dover, pp. 78-83, ISBN 0-486-22856-8, 1911.

[12] K. K. Naragon, "The Jankó Keyboard" [PhD thesis], West Virginia University, 1977.

[13] tokyo yusyo inc, "Chromatone" [online], 2014, http://chromatone.jp (Accessed: January 2015).

[14] The Music Notation Project, "Tri-Chromatic Keyboard Layout" [online], http://musicnotation.org/wiki/instruments/tri-chromatic-keyboard-layout/ (Accessed: January 2015).

[15] crea-7, "Dodeka" [online], http://www.dodeka.info (Accessed: January 2015).

[16] The Music Notation Project, "Software" [online], http://musicnotation.org/software/#ftn1 (Accessed: January 2015).

[17] MakeMusic, Inc., "Finale" [online], 2015, http://www.finalemusic.org (Accessed: January 2015).
EXPRESSIVE QUANTIZATION OF COMPLEX RHYTHMIC STRUCTURES FOR AUTOMATIC MUSIC TRANSCRIPTION

Mauricio Rodriguez
Superior Conservatory of Castile and Leon
[email protected]
quantizers lacking the supervision of "logical/intelligent" algorithms or user-input inspection are evident when, for instance, trying to quantize a simple rhythmic pattern of a ritardando figure, whose notational result would most likely be quantized with many tied irregular tuplets, without this resulting quantization necessarily showing the simple gesture-figure of a deceleration.
  A general-purpose multi-nesting quantizer would not necessarily overcome the limitations shared by previous "non-logical" quantizers; however, it is claimed that a refined level of musical expressivity is achieved when the problem of quantization is generalized to capture and render complex rhythmic structures such as those present in multi-nested (fine-grained) rhythmic patterns.
  also be arbitrarily set by the user, allowing for different notational resolutions of the same input.
Figure 3. Multi-Level Quantization I.

  In the following figure, the resolution subdivision is downsampled to half (to 6) of the previous quantization (Figure 4):

Figure 4. Multi-Level Quantization II.

  Lastly, the same array of durations is quantized with a different metric container (6/32, 1/4, and 7/32), and the maximum number of subdivisions per nesting level is again 12 (Figure 5):
that the searching space used to compare and obtain the optimal error-difference is manually introduced by the user, instead of being algorithmically generated. Once the user includes a new "rhythmic word" in the dictionary, using a symbolic "rhythmic-tree" representation, the first task for the look-up algorithm is to convert any "word" into its timing equivalent (e.g. (1 (1 1 (2 (1 1 1)))) is equivalent to 0.25, 0.25, 0.5/3, 0.5/3 and 0.5/3 seconds, assuming a quarter note is equal to one second). From there, the comparison of the original time-input sequence with the time-converted "words" is straightforward. The next step is to perform the proper rhythmic configuration groupings of the "word" that is chosen as the optimal quantization. If, for instance, the original time input is 0.25, 0.58 and 0.17 seconds, the place-holder word (1 (1 1 (2 (1 1 1)))) would be output as (1 (1 1 (2 (2.0 1)))), this result being the best quantization among the given words of that dictionary. A sketch of this conversion and look-up is given below.
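To make the conversion concrete, here is a minimal Python sketch (our own illustration, not the author's implementation) that expands a rhythmic tree into absolute durations and selects the dictionary word minimizing the total absolute error against a time input. A tree node is written as (weight, [children]); leaves are bare weights. The subsequent grouping step (merging adjacent units, as in the (2 (2.0 1)) output above) is omitted for brevity, so this simplified look-up only considers words with as many leaves as input events.

  # Illustrative sketch: rhythm-tree expansion and dictionary look-up.
  # A node is either a bare weight (leaf) or a (weight, [children]) pair.

  def weight(node):
      return node[0] if isinstance(node, tuple) else node

  def durations(node, span):
      """Expand a rhythm tree into a flat list of durations summing to `span`."""
      if not isinstance(node, tuple):
          return [span]
      _, children = node
      total = sum(weight(c) for c in children)
      return [d for c in children
                for d in durations(c, span * weight(c) / total)]

  def best_word(dictionary, times, quarter=1.0):
      """Pick the dictionary word whose durations best match the time input."""
      def error(word):
          durs = durations(word, weight(word) * quarter)
          if len(durs) != len(times):
              return float('inf')          # not quantizable with this word
          return sum(abs(d - t) for d, t in zip(durs, times))
      return min(dictionary, key=error)

  word = (1, [1, 1, (2, [1, 1, 1])])       # i.e. (1 (1 1 (2 (1 1 1))))
  print(durations(word, 1.0))              # [0.25, 0.25, 0.1667, 0.1667, 0.1667]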
  The idea of a user-predefined rhythmic dictionary might appear burdensome at first, but this quantizing model is essentially as effective as any other general-purpose quantizer, with the invaluable advantage of rendering rhythmic results that fully conform to a precise selection of rhythmic configurations previously input by its users; therefore, the expressivity of the resulting transcriptions accommodates completely to the idiomatic and aesthetic needs of composers.
  When working with the look-up table quantizer, it is important to keep in mind that varied and fine-grained quantizations can only take place if there is a comprehensively large data-set of place-holder rhythmic words in the dictionary; otherwise, one or several input values could in some cases not be quantized, in which case the output of the algorithm will indicate the number of non-quantized events. An interesting compositional strategy for using this quantizer can be to force the quantization process onto a limited set of dictionary words and, by gradually changing, or rather expanding, the searching space where quantization takes place, let the different resolutions of the quantization show the kinds of transcriptions that fit the original input data most naturally. To facilitate this compositional methodology, there is an additional routine in this quantizer to compare and sort the deviation-error similarity (in ascending to descending order) of one chosen word in relation to all the other words of the dictionary. Additionally, the similarity comparison among words can be truncated to show only the ones that present the same "rhythmic profile" as the reference word being compared; for example, the rhythmic tree (1 (3 1 2)) could be output as similar to (1 (4 1 3)), since both share the same "rhythmic profile", meaning that the first duration of the group is larger than the second, and the second is shorter than the third. A sketch of this profile comparison is given below.
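The "rhythmic profile" can be read as the ordinal pattern of successive durations. The following minimal Python sketch illustrates that reading (our illustration; the author's routine also sorts candidates by deviation-error, which is omitted here):

  # Illustrative sketch: two duration groups share a "rhythmic profile" when
  # the order relations between successive durations are the same.

  def profile(durs):
      """Ordinal pattern of a duration list, e.g. [3, 1, 2] -> ['>', '<']."""
      return ['>' if a > b else '<' if a < b else '='
              for a, b in zip(durs, durs[1:])]

  def same_profile(a, b):
      return len(a) == len(b) and profile(a) == profile(b)

  print(same_profile([3, 1, 2], [4, 1, 3]))   # True, as in the example above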
  The following figures show the similarity rankings from the rhythmic tree (1 (2 1 1 4)), which is indeed a grouped version of the simple rhythmic tree (1 (1 1 1 1 1 1 1 1)); first, similarity is presented regardless of rhythmic profile (Figure 6), and then truncated to show words with equivalent profiles (Figure 7). The results of these comparisons are based on the following searching-space dictionary:
provide the resulting notation with musically consistent results; on the other hand, quantization results should conform to the aesthetic and notational idiosyncrasy of a given user. Two general-purpose quantizing models have been presented that aim for notational expressivity from different perspectives. A multi-level or multi-nesting quantizer achieves 'expression' through fine-grained, high-resolution output, while preserving an uncompromising general-purpose applicability (i.e. no pre/post filter processes are applied to the input data). A look-up table quantizer guarantees expressivity through a user-defined data-set that works as a closed searching-space from which quantization takes place. These two quantizers aim to be used as computing tools to facilitate and assist the composition, writing and notational rendering of music works.
COMPUTER-AIDED MELODY NOTE TRANSCRIPTION USING THE TONY SOFTWARE: ACCURACY AND EFFICIENCY
Sonic Visualiser [3] (http://www.sonicvisualiser.org/) and Praat [4] (http://www.fon.hum.uva.nl/praat/), and in the case of DSP tools it is YIN [5]. None of the tools with user interfaces are specifically aimed at note and pitch transcription in music; some were originally aimed at the analysis of speech, e.g. Praat, others are generic music annotation tools, e.g. Sonic Visualiser and AudioSculpt [6]. In either case, the process of extracting note frequencies remains laborious and can take many times the duration of the recording. As a consequence, many researchers use a chain of multiple tools in custom setups in which some parts are automatic (e.g. using AMPACT alignment [7]), as we have previously done ourselves [8]. Commercial tools such as Melodyne (http://www.celemony.com/), Songs2See (http://www.songs2see.com/) and Sing&See (http://www.singandsee.com/) serve similar but incompatible purposes. Melodyne in particular offers a very sleek interface, but its frequency estimation procedures are not public (proprietary code), notes cannot be sonified, and clear-text export of note and pitch track data is not provided.

Box 1. Survey Results.
The tools with graphical user interfaces mentioned by survey participants were: Sonic Visualiser (12 participants), Praat (11), Custom-built (3), Melodyne (3), Raven (and Canary) (3), Tony (3), WaveSurfer (3), Cubase (2), and the following, mentioned once: AudioSculpt, Adobe Audition, Audacity, Logic, Sound Analysis Pro, Tartini and Transcribe!.
The DSP algorithms mentioned by survey participants were: YIN (5 participants), Custom-built (3), Aubio (2), and the following, mentioned once: AMPACT, AMT, DESAM Toolbox, MELODIA, MIR Toolbox, Tartini, TuneR, SampleSumo, silbido, STRAIGHT and SWIPE.

  Field of work                        Position
  Music Inf./MIR       17 (55%)        Student          11 (35%)
  Musicology            4 (13%)        Faculty Member   10 (32%)
  Bioacoustics          3 (10%)        Post-doc          6 (19%)
  Speech Processing     2 (5%)         Industry          4 (13%)
  Experience
    Pitch track        18* (58%)
    Note track         16* (52%)
    Both                7 (23%)
    None                3 (10%)
  *) includes 7 who had experience with both pitch and note tracks.

Table 2. Participants of the survey. Top four responses for participant makeup.

  In summary, the survey further corroborated the impression gained during our own experiments on note intonation: a tool for efficient annotation of melodies is not available, and the apparent interest in the scientific study of melody provides ample demand to create just such a tool. We therefore set out to create Tony, a tool that focusses on melodic annotation (as opposed to general audio annotation or polyphonic note annotation). The Tony tool is aimed at providing the following components: (a) state-of-the-art algorithms for pitch and note estimation with high frequency resolution, (b) a graphical user interface with visual and auditory feedback for easy error-spotting, (c) an intelligent interactive interface for rapid correction of estimation errors, (d) extensive export functions enabling further processing in other applications. Lastly, the tool should be freely available to anyone in the research community, as it already is (see Table 1). This paper demonstrates that the remaining requirements have also been met.

  software                     URL
  Tony                         https://code.soundsoftware.ac.uk/projects/tony
  pYIN                         https://code.soundsoftware.ac.uk/projects/pyin
  Pitch Estimator              https://code.soundsoftware.ac.uk/projects/chp
  Sonic Visualiser Libraries   https://code.soundsoftware.ac.uk/projects/sv

Table 1. Software availability.

  Any modern tool for melody annotation from audio requires signal processing tools for pitch (or fundamental frequency, F0) estimation and note transcription. We are concerned here with estimation from monophonic audio, not with the estimation of the predominant melody from a polyphonic mixture (e.g. [9, 10]). Several solutions to the problem of F0 estimation have been proposed, including mechanical contraptions dating back as far as the early 20th century [11]. Recently, the area of speech processing has generated several methods that have considerably advanced the state of the art [4, 5, 12, 13]. Among these, the YIN fundamental frequency estimator [5] has gained popularity beyond the speech processing community, especially in the analysis of singing [14, 15] (also, see the survey above). Babacan et al. [16] provide an overview of the performance of F0 trackers on singing, in which YIN is shown to be state of the art, and particularly effective at fine pitch recognition. More recently, our own pYIN pitch track estimator has been shown to be robust against several kinds
of degradations [17] and to be one of the most accurate pitch transcribers, especially for query-by-singing applications [18] (alongside the MELODIA pitch tracker [10]).
  The transcription of melodic notes has received far less attention than pitch tracking, perhaps because polyphonic note transcription [19, 20] was deemed the more exciting research problem, but several noteworthy methods exist [2, 21, 22]. We have implemented our own note transcription method intended for use in Tony, of which a previous version has been available as part of the pYIN Vamp plugin [17]. This is the first time pYIN note transcription has been presented and evaluated in a scientific paper.

3. METHOD

Tony implements several melody estimation methods: fully automatic pitch estimation and note tracking based on pYIN [17], and custom methods for interactive re-estimation. Tony resamples any input file to a rate of 44.1 kHz (if necessary), and the signal processing methods work on overlapping frames of 2048 samples (≈46 ms) with a hop size of 256 samples (≈6 ms).

3.1 Pitch Estimation

We use the existing probabilistic YIN (pYIN) method [17] to extract a pitch track from monophonic audio recordings. The pYIN method is based on the YIN algorithm [5]. Conventional YIN has a single threshold parameter and produces a single pitch estimate. The first stage of pYIN calculates multiple pitch candidates with associated probabilities, based on a distribution over many threshold parameter settings. In a second stage, these probabilities are used as observations in a hidden Markov model, which is then Viterbi-decoded to produce an improved pitch track. This pitch track is used in Tony, and is also the basis for the note detection algorithm described below.
3.2 Note Transcription

The note transcription method takes as input the pYIN pitch track and outputs discrete notes on a continuous pitch scale, based on Viterbi decoding of a second, independent hidden Markov model (HMM). Unlike other similar models, ours does not quantise the pitches to semitones, but instead allows a more fine-grained analysis. The HMM models pitches from MIDI pitch 35 (B1, ≈61 Hz) to MIDI pitch 85 (C♯6, ≈1109 Hz) at 3 steps per semitone, resulting in n = 207 distinct pitches. Following Ryynänen [21], we represent each pitch by three states representing attack, stable part and silence, respectively. The likelihood of a non-silent state emitting a pitch track frame with pitch q is modelled as a Gaussian distribution centered at the note's pitch p with a standard deviation of σ semitones, i.e.

    P(n_p | q) = v · (1/z) · [φ_{p,σ}(q)]^τ    (1)

where n_p is a state modelling the MIDI pitch p, z is a normalising constant and the parameter 0 < τ < 1 controls how much the pitch estimate is trusted; we set τ = 0.1. The probability of unvoiced states is set to P(unvoiced | q) = (1 − v)/n, i.e. they sum to their combined likelihood of (1 − v), where v = 0.5 is the prior likelihood of a frame being voiced. The standard deviation σ varies depending on the state: attack states have a larger standard deviation (σ = 5 semitones) than stable parts (σ = 0.9 semitones). This models the fact that the beginnings of notes and note transitions tend to vary more in pitch than the main, stable parts of notes. A small sketch of this emission model is given below.
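For concreteness, the following Python sketch (our illustration, not the authors' code) computes the emission likelihoods of equation (1) over the n = 207 pitch states for one observed frame. The pitch grid and the choice of z such that the voiced states sum to v follow the text; everything else is an assumption.

  # Illustrative sketch of the emission model in equation (1).
  import numpy as np

  PITCHES = np.linspace(35.0, 85.0, 207)   # n = 207 pitch states, MIDI 35..85
  V, TAU, N = 0.5, 0.1, len(PITCHES)

  def emission_probs(q, sigma):
      """P(n_p | q) for an observed frame pitch q (MIDI), as in eq. (1)."""
      phi = np.exp(-0.5 * ((PITCHES - q) / sigma) ** 2) \
            / (sigma * np.sqrt(2 * np.pi))     # Gaussian density phi_{p,sigma}(q)
      weighted = phi ** TAU                    # soften by the trust parameter tau
      return V * weighted / weighted.sum()     # z chosen so voiced states sum to v

  p_stable = emission_probs(q=60.2, sigma=0.9)   # stable states, sigma = 0.9
  p_attack = emission_probs(q=60.2, sigma=5.0)   # attack states, sigma = 5
  # each unvoiced state carries the remaining mass (1 - V) / N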
  The transition model imposes continuity and reasonable pitch transitions. Figure 1a shows a single-note model, with connections to other notes. Within a note we use a 3-state left-to-right HMM consisting of Attack, Stable and Silent states. These states are characterised by high self-transition probabilities (0.9, 0.99 and 0.9999 for the three note states, respectively), to ensure continuity. Within a note, the only possibility other than self-transition is to progress to the next state. The last note state, the Silent state, allows transitions to many different Attack states of other notes. Like the musicological model in Ryynänen and Klapuri's approach [21], we provide likelihoods for note transitions. Unlike their approach, we do not deal with notes quantised to the integer MIDI scale, and so we decided to go for a simpler heuristic that takes only three factors into account: (1) a note's pitch has to be either the same as the preceding note's or at least 2/3 of a semitone different; (2) small pitch changes are more likely than larger ones; (3) the maximum pitch difference between two consecutive notes is 13 semitones. A part of the transition distribution to notes with nearby pitches is illustrated in Figure 1b; a sketch of such a heuristic follows below.

Figure 1. (a) Excerpt of the pYIN note transition network. (b) Central part of the note transition probability function.
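The exact functional form of the note-transition likelihood is not given in the text (it is only shown graphically in Figure 1b); the following sketch is therefore a hypothetical weighting that merely satisfies the three stated factors.

  # Hypothetical note-transition weighting consistent with factors (1)-(3);
  # the paper's actual function is only shown graphically (Figure 1b).
  import numpy as np

  def transition_weight(delta_semitones):
      """Unnormalised likelihood of a pitch jump between consecutive notes."""
      d = abs(delta_semitones)
      if 0.0 < d < 2/3:             # factor (1): same pitch or >= 2/3 semitone
          return 0.0
      if d > 13.0:                  # factor (3): at most 13 semitones
          return 0.0
      return np.exp(-d / 2.0)       # factor (2): small changes more likely

  steps = np.linspace(-2, 2, 13)    # pitch grid at 1/3-semitone resolution
  weights = [transition_weight(s) for s in steps]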
3.3 Note Post-processing

We employ two post-processing steps. The first, amplitude-based onset segmentation, helps separate consecutive notes (syllables) of similar pitches, as follows. We calculate the root mean square (RMS) amplitude, denoted by a_i, in every frame i. In order to estimate the amplitude rise around a particular frame i, we calculate the ratio of the RMS values between the frames on either side,

    r = a_{i+1} / a_{i-1}    (2)

Given a sensitivity parameter s, any rise with 1/r < s is considered part of an onset (the inverse 1/r is used in order for s to correspond to sensitivity), and the frame i − 2 is set to unvoiced, thus creating a gap within any existing note. If no note is present, nothing changes, i.e. no additional notes are introduced in this onset detection step. The second post-processing step, minimum duration pruning, simply discards notes shorter than a threshold, usually chosen around 100 ms. Both steps are sketched below.
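Here is a minimal Python sketch of both post-processing steps, under our own assumptions about the data layout (frame-wise RMS values in an array; a note as a list of voiced frame indices):

  # Illustrative sketch of the two note post-processing steps.
  import numpy as np

  def onset_gaps(rms, s):
      """Amplitude-based onset segmentation: return frames to set unvoiced."""
      unvoice = []
      for i in range(2, len(rms) - 1):
          r = rms[i + 1] / rms[i - 1]    # amplitude rise around frame i, eq. (2)
          if 1.0 / r < s:                # sensitivity test on the inverse ratio
              unvoice.append(i - 2)      # create a gap within any existing note
      return unvoice

  def prune_short(notes, hop_sec=256 / 44100.0, min_dur=0.1):
      """Minimum duration pruning: drop notes shorter than ~100 ms."""
      return [n for n in notes if len(n) * hop_sec >= min_dur]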
3.4 Semi-automatic Pitch Track Re-estimation

In addition to fully manual editing of notes (Section 3.4.2), the user can also change the pitch track. However, since human beings do not directly perceive pitch tracks, Tony offers pitch track candidates which users can choose from. Two methods are available: multiple alternative pYIN pitch tracks on a user-selected time interval, and a single pitch track on a user-selected time-pitch rectangle.

3.4.1 Multiple pYIN pitch tracks

In order to extract multiple pitch tracks, the pYIN method is modified such that its second stage runs multiple times with different frequency ranges emphasised. The intended use of this is to correct pitches over short time intervals. As in the default version, the first pYIN stage extracts multiple pitch candidates m_i (given in floating-point MIDI pitches) for every frame, with associated probabilities p_i. Depending on the frequency range, these candidate probabilities are now weighted by a Gaussian distribution centered at c_j = 48 + 3 × j, j = 1, . . . , 13, for the j-th frequency range, i.e. the new candidate pitch probabilities are

    p_ij = p_i × φ_{c_j, σ_r}(m_i)    (3)

where φ(·) is the Gaussian probability density function and σ_r = 8 is the pitch standard deviation, indicating the frequency width of the range. With these modified pitch probabilities, the Viterbi decoding is carried out as usual, leading to a total of 13 pitch tracks.
  Finally, duplicate pitch tracks among those from the 13 ranges are eliminated. Two pitch tracks are classified as duplicates if at least 80% of their pitches coincide. Among each duplicate pair, the pitch track with the shorter time coverage is eliminated. A sketch of the candidate re-weighting is given below.
3.4.2 Pitch track in time-pitch rectangle

In some cases, the desired pitch track is not among those offered by the method described in Section 3.4.1. In such cases we use a YIN-independent method of finding pitches based on a simple harmonic product spectrum [23]. When using this method, the user provides the pitch and time range (a rectangle), and for every frame the method returns the pitch with the maximum harmonic product spectral value (or no pitch, if the maximum occurs at the upper or lower boundary of the pitch range). This way, even subtle pitches can be annotated, provided that they are local maxima of the harmonic product spectrum. A sketch of this search is given below.
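A minimal sketch of a harmonic product spectrum search restricted to a pitch range, under our own assumptions (a magnitude spectrum as input, three harmonics); the actual method follows [23] and may differ in detail:

  # Illustrative harmonic product spectrum (HPS) search within a pitch range.
  import numpy as np

  def hps_pitch(mag, sr, n_fft, f_lo, f_hi, harmonics=3):
      """Return the frequency in [f_lo, f_hi] maximising the HPS, or None."""
      lo = int(f_lo * n_fft / sr)
      hi = min(int(f_hi * n_fft / sr), (len(mag) - 1) // harmonics)
      k_range = np.arange(lo, hi + 1)
      hps = np.ones(len(k_range))
      for h in range(1, harmonics + 1):   # multiply magnitudes at bins k, 2k, 3k
          hps *= mag[k_range * h]
      k = k_range[np.argmax(hps)]
      if k in (lo, hi):                   # maximum on the range boundary:
          return None                     # report "no pitch" for this frame
      return k * sr / n_fft               # bin index back to frequency in Hz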
                                                                       bination of representations they wish.
4. USER INTERFACE

Figure 2 is a screenshot of the Tony user interface. The basic interface components, as well as the underlying audio engine and other core components, are well tested, as they come from the mature code base of Sonic Visualiser (see also Table 1). Tony differs from the other tools in that it is designed for musical note sequences, not general pitch events, and is intentionally restricted to the annotation of single melodies. This specialisation has informed many of our design choices. Below we highlight several key aspects of the Tony interface.

Figure 2. The Tony user interface. Annotated elements include the tool choice, overview, main pane, candidate pitch track, notes (blue) and pitch track, waveform, selection strip, spectrogram toggle, and the show, play, and gain/pan elements for audio, pitch track and notes.

4.1 Graphical Interface

While graphical interface components from Sonic Visualiser have been re-used, the focus on a single task has allowed us to combine all relevant visualisation components into a single pane: pitch track, note track, spectrogram and waveform. Visibility of all of them can be toggled. The focus on single melodies meant that we could design a special note layer with non-overlapping notes. This averts possible annotation errors from overlapping pitches.
  As soon as the user opens an audio file, melodic representations of pitch track and notes are calculated using the methods described in Sections 3.1 and 3.2. This contrasts with general tools like Praat, Sonic Visualiser or AudioSculpt, which offer a range of processing options the user has to select from. This is avoided in Tony, since the analysis objective is known in advance. However, the user has some control over the analysis parameters via the menu and can re-run the analysis with changed parameters.
  Editing pitch tracks and notes is organised separately. Note edits concern only the placement and duration of notes in time; their pitch is calculated on the fly as the median of the underlying pitch track. Any corrections in the pitch dimension are carried out via the pitch track.
  In order to select pitches or notes, the user selects a time interval, either via the Selection Strip or via keyboard commands. Both pitch track and note track can then be manipulated based on the selection. The simplest pitch track actions are: choose a higher/lower pitch (by octave) in the selected area; remove pitches in the selected area. For more sophisticated pitch correction, the user can request alternative pitch tracks in a selected time interval (see Section 3.4.1), or the single most likely pitch track in a time-pitch rectangle (see Section 3.4.2). Note actions are: Split, Merge, Delete, Create (including "form note from selection"), and Move (boundary). The note pitch is always the median of the pitch track estimates it covers and is updated in real time.

4.2 Sound Interface

Tony provides auditory feedback by playing back the extracted pitch track as well as the note track alongside the original audio. Like the visual pitch track and note representations, playback (including that of the original recording) can be toggled using dedicated buttons in a toolbar (see Figure 2), giving users the choice to listen to any combination of representations they wish.
  Sonification of the notes is realised as wave-table playback of an electric piano sound. The sound was especially synthesised for its neutral timbre and uniform evolution. Unlike other programs, synthesis in Tony is not constrained to integer MIDI notes, and can sonify subtle pitch differences such as often occur in real-world performances. The pitch track is synthesised on the fly, using sinusoidal additive synthesis of the first three harmonic partials.

5. EVALUATION

To assess the utility of Tony as a note transcription system, we conducted two experiments. First, we compared the underlying note transcription method to existing methods, using a publicly available dataset [24]. Second, in a real-world task, an expert annotated notes for an intonation study using the Tony software, and we measured the time taken and the number of notes manipulated. The experimental results are given below.
  Method                  Overall Acc.  Raw Pitch Acc.  Voicing False Alarm  Voicing Recall  F COnPOff  F COnP  F COn
  melotranscript          0.80          0.87            0.37                 0.97            0.45       0.57    0.63
  ryynanen                0.72          0.76            0.37                 0.94            0.30       0.47    0.64
  smstools                0.80          0.88            0.41                 0.99            0.39       0.55    0.66
  pYIN s=0.0, prn=0.00    0.83          0.91            0.37                 0.98            0.38       0.56    0.61
  pYIN s=0.0, prn=0.07    0.84          0.91            0.34                 0.98            0.40       0.59    0.64
  pYIN s=0.0, prn=0.10    0.84          0.91            0.33                 0.97            0.41       0.60    0.64
  pYIN s=0.0, prn=0.15    0.84          0.90            0.32                 0.96            0.41       0.60    0.63
  pYIN s=0.6, prn=0.00    0.84          0.91            0.35                 0.98            0.38       0.56    0.61
  pYIN s=0.6, prn=0.07    0.84          0.91            0.32                 0.97            0.43       0.62    0.67
  pYIN s=0.6, prn=0.10    0.85          0.91            0.31                 0.97            0.44       0.62    0.67
  pYIN s=0.6, prn=0.15    0.85          0.90            0.29                 0.95            0.44       0.62    0.65
  pYIN s=0.7, prn=0.00    0.83          0.90            0.33                 0.97            0.39       0.54    0.61
  pYIN s=0.7, prn=0.07    0.85          0.91            0.30                 0.97            0.46       0.63    0.69
  pYIN s=0.7, prn=0.10    0.85          0.90            0.29                 0.96            0.47       0.64    0.69
  pYIN s=0.7, prn=0.15    0.85          0.89            0.27                 0.94            0.47       0.64    0.67
  pYIN s=0.8, prn=0.00    0.84          0.89            0.28                 0.96            0.39       0.52    0.61
  pYIN s=0.8, prn=0.07    0.85          0.89            0.25                 0.95            0.48       0.66    0.73
  pYIN s=0.8, prn=0.10    0.85          0.89            0.24                 0.94            0.49       0.68    0.73
  pYIN s=0.8, prn=0.15    0.85          0.87            0.22                 0.91            0.50       0.67    0.71

Table 3. Results of the note transcription evaluation (Section 5.1). "s" is the onset sensitivity parameter and "prn" the minimum duration pruning threshold in seconds.
taken and the number of notes manipulated. The experi-                          serve that—without post-processing—the pYIN note tran-
mental results are given below.                                                 scription achieves values slightly worse than the best-
                                                                                performing algorithm (melotranscript). Considering the
5.1 Accuracy of Automatic Transcription                                         post-processed versions of pYIN, minimum duration prun-
                                                                                ing alone does not lead to substantial improvements. How-
We used a test set of 38 pieces of solo vocal music (11                         ever, a combination of onset detection and minimum du-
adult females, 13 adult males and 14 children) as col-                          ration pruning leads to COnPOff F values of up to 0.50,
lected and annotated in a previous study [24]. All files                        compared to 0.38 for the baseline pYIN and 0.45 for the
are sampled at 44.1 kHz. We also obtained note transcrip-                       best other algorithm (melotranscript). This carries through
tion results extracted by three other methods: Melotranscript [22], Gómez and Bonada [2], and Ryynänen [21]. We ran 16 different versions of Tony's note transcription algorithm, a grid search over 4 parameter settings for each of the two post-processing methods. Minimum duration pruning was parametrised to 0 ms (no pruning), 70 ms, 100 ms and 150 ms. The amplitude-based onset segmentation parameter was varied as s = 0, 0.6, 0.7 and 0.8.
  For frame-wise evaluation we used metrics from the evaluation of pitch tracks [25] as implemented in mir_eval [26], but applied them to notes by assigning to every frame the pitch of the note that covers it. The results are listed in Table 3. The pYIN note transcriptions reach very high overall accuracy rates (0.83–0.85) throughout; the highest score of the other methods tested is 0.80. 9 Among the pYIN versions tested, the best outcome was achieved by combining pruning of at least 100 ms with an onset sensitivity parameter of at least s = 0.6. The efficacy of the system results from high raw pitch accuracy (the pitch is correct where a pitch is detected) combined with a low voicing false alarm rate. There is, however, a tradeoff between the two: better raw pitch accuracy is achieved with low values of s, and lower false alarm rates with higher values of s. The algorithm smstools achieves perfect voicing recall at the price of having the highest voicing false alarm rate.
  The results for note-based evaluation expose more subtle differences. The metric "COnPOff" [24], which takes into account correct note onset time (±50 ms), pitch (±0.5 semitones) and offset (±20% of the ground truth note duration), is the most demanding metric; "COnP" (correct onset and pitch) and "COn" (correct onset) are relaxed metrics. Here, we report F measures only. We observe that the advantage of the post-processed pYIN versions extends to the more relaxed evaluation measures, where F values of the versions with at least 0.10 seconds pruning are always higher than those of the baseline pYIN algorithm and the other algorithms tested. Figure 3 shows all 100 ms-pruned pYIN results against the other algorithms.

5.2 Effort of Manual Note Correction

In order to examine the usability of Tony we measured how editing affects the time taken to annotate tunes. We used recordings of amateur singing created for a different project, and one of us (JD) annotated them so that each final note annotation corresponded exactly to one ground truth note in the musical score, matching her perception of the notes the singer was actually performing. The dataset consists of 96 recordings, with 32 singers performing three tunes from the musical The Sound of Music. The annotation was performed with an earlier version of Tony (0.6).
  Tony offers five basic editing operations: Create, Delete, Split, Join, and Move (of either the left or the right note boundary). We estimated the number of edits required, considering only timing adjustments (i.e. ignoring any changes to the pitch of a note). 10 The estimate is based on a custom edit distance implementation. First, we jointly represent the actual state of the note track (after automatic extraction) and the desired state of the note track as a string of tokens. Secondly, we define transformation rules that correspond to the five possible edit operations. The estimated number of edits performed by the user is then obtained by an automated calculation of a series of reductions that transform the source string into the target. In particular, if pYIN happened to perform a completely correct segmentation "out of the box", the edit count would be zero.
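To make the estimation procedure concrete, the sketch below counts the five operations for a pair of note lists. It is a minimal sketch only: instead of the token-string reduction described above, it uses simplified overlap-based matching rules as illustrative assumptions (unmatched target notes count as Creates, unmatched transcribed notes as Deletes, a transcribed note covering several target notes as Splits, several transcribed notes covering one target note as Joins, and 1:1 matches with a boundary off by more than a tolerance as Moves).

# A simplified, overlap-based sketch of the edit count estimate; the
# matching rules are illustrative assumptions, not the authors' method.
def estimate_edits(transcribed, target, move_tol=0.05):
    """Notes are (onset_s, offset_s) pairs. Returns estimated counts of
    the five Tony edit operations."""
    counts = {'Create': 0, 'Delete': 0, 'Split': 0, 'Join': 0, 'Move': 0}

    def overlaps(a, b):
        return min(a[1], b[1]) - max(a[0], b[0]) > 0

    # Which target notes does each transcribed note overlap, and vice versa?
    t_matches = [[j for j, t in enumerate(target) if overlaps(n, t)]
                 for n in transcribed]
    g_matches = [[i for i, n in enumerate(transcribed) if overlaps(n, t)]
                 for t in target]

    for js in t_matches:
        if not js:                          # spurious note: delete it
            counts['Delete'] += 1
        elif len(js) > 1:                   # one note spans several targets:
            counts['Split'] += len(js) - 1  # one split per extra target
    for j, iis in enumerate(g_matches):
        if not iis:                         # missed note: create it
            counts['Create'] += 1
        elif len(iis) > 1:                  # target fragmented:
            counts['Join'] += len(iis) - 1  # one join per extra fragment
        elif len(t_matches[iis[0]]) == 1:   # clean 1:1 match:
            n, t = transcribed[iis[0]], target[j]
            # count a Move for each boundary off by more than move_tol
            counts['Move'] += ((abs(n[0] - t[0]) > move_tol)
                               + (abs(n[1] - t[1]) > move_tol))
    return counts

# One fragmented target note (one Join) and one missed note (one Create):
print(estimate_edits([(0.0, 0.4), (0.45, 0.9)], [(0.0, 0.9), (1.0, 1.5)]))

A perfect segmentation yields all-zero counts, matching the remark above.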
  9 Note that Ryynänen's method outputs only integer MIDI notes, so for the fine-grained analysis required here it may be at a disadvantage.
  10 At the time of the experiment we were not able to record the actual actions taken.
Figure 3. Results of existing algorithms and pYIN note transcription with minimum duration pruning at 0.1 s, showing,
from left to right, raw pitch accuracy, overall accuracy, voicing false alarm, COnPOff F measure and COnP F measure.
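A rough equivalent of the note-based measures shown in Figure 3 can be computed with mir_eval's transcription metrics, whose default tolerances (onset within ±50 ms, pitch within ±50 cents, offset within ±20% of the reference note duration) mirror the COnPOff criteria given above. The sketch below uses hypothetical notes and approximates, rather than reproduces, the evaluation framework of [24].

import numpy as np
import mir_eval

# Hypothetical reference and estimated notes: intervals in seconds,
# pitches in Hz (the representation mir_eval.transcription expects).
ref_intervals = np.array([[0.10, 0.55], [0.60, 1.00]])
ref_pitches   = np.array([440.0, 493.9])
est_intervals = np.array([[0.12, 0.50], [0.58, 1.05]])
est_pitches   = np.array([441.0, 495.0])

# COnPOff-style: onset (±50 ms), pitch (±0.5 semitones = 50 cents)
# and offset (±20% of reference duration) must all be correct.
p, r, f, _ = mir_eval.transcription.precision_recall_f1_overlap(
    ref_intervals, ref_pitches, est_intervals, est_pitches,
    onset_tolerance=0.05, pitch_tolerance=50.0, offset_ratio=0.2)

# COnP-style: ignore offsets by disabling the offset criterion.
p2, r2, f2, _ = mir_eval.transcription.precision_recall_f1_overlap(
    ref_intervals, ref_pitches, est_intervals, est_pitches,
    onset_tolerance=0.05, pitch_tolerance=50.0, offset_ratio=None)

print('COnPOff F:', round(f, 3), ' COnP F:', round(f2, 3))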
Figure 4. Edit operations. (a) Number of edits per recording; box plots with mean values (dots). (b) Time taken against edits.

                 Est. (seconds)   Std. Error   p value
  (Intercept)            437.20        51.87     <0.01
  Joins                    3.18         2.35      0.18
  Move                    45.51        39.61      0.25
  Familiarity             -2.31         0.82      0.01

Table 4. Multivariate linear regression of annotation time (seconds) on edit counts and familiarity. [Only these rows could be recovered; the rows for Creates, Deletes and Splits are illegible in the source.]

  Figure 4a illustrates the distributions of edit counts in a box plot with added indicators of the mean. First of all, we notice that very few notes had to be Created (a mean of 0.17 per recording) or Moved (0.28), and that Join (8.64) and Delete (8.82) are by far the most frequent edit operations, followed by Split (4.73). As expected, the total number of edits correlates with the time taken to annotate the recordings (see Figure 4b).
  Which other factors influence the annotation time taken? We use multivariate linear regression on the numbers of Creates, Deletes, Splits, Joins and Moves and on familiarity with the Tony software (covariates), predicting the annotation time (response). As expected, the results in Table 4 show that any type of editing increases annotation time, and that familiarity reduces it. The baseline annotation time is 437 seconds, more than 7 minutes. (The mean duration of the pieces is 179 seconds, just under 3 minutes.) The result on Familiarity suggests that every day spent working with Tony reduces the time needed for annotation by 2.3 seconds. 11 The estimated time taken for every Create action is 145 seconds, a huge amount of time, which can only be explained by the fact that this operation was very rare and was only used on tracks that were very difficult anyway. Similar reasoning applies to the (boundary) Move operations, though the p value suggests that the estimate cannot be made with much confidence. The distinction between the remaining three edit operations is more helpful: each Delete and Join accounts for about 3.5 seconds of added time, but Splits take much longer: 5.7 seconds. This is likely to result from the fact that the user has to position the play head or mouse pointer precisely at the split position, whereas Joins and Deletes require far less precise mouse actions. As Table 4 shows, most of the effects are at least moderately significant (p < 0.1), with the exception of the number of Joins. The variance explained is R² = 25%.

6. DISCUSSION

The results of the second experiment may well have an impact on the design of future automatic melody transcription systems. They confirm the intuition that some edit actions take substantially more time for a human annotator to execute. For example, the fact that Joins are much cheaper than Splits suggests that high onset recall is more important than high onset precision.
  We are also aware that the accuracy of automatic transcription depends heavily on the material. The tools we evaluated (including existing algorithms) were well suited to the database of singing we used; in other annotation experiments [27] it has become obvious that some instruments are more difficult to pitch-track. Furthermore, it is useful to bear in mind that the dataset we used is predominantly voiced, so the voicing false alarm outcomes may change on different data.
  As evident from our survey (Box 1), early versions of Tony have already been used by the community. This includes our own use to create the MedleyDB resource [27], and some as yet unpublished internal singing intonation and violin vibrato experiments.

  11 This is clearly only true within a finite study, since the reduction cannot continue forever. Annotations happened on 14 different days.
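The regression analysis of Section 5.2 can be set up as in the following sketch. The data here are synthetic, generated so that the coefficients echo the estimates reported above, and the statsmodels package is an assumed choice of tooling.

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 96  # one observation per recording

# Synthetic covariates: per-recording edit counts (means as reported)
# and familiarity (days of prior use, up to the 14 annotation days).
X = np.column_stack([
    rng.poisson(0.17, n),    # Creates
    rng.poisson(8.82, n),    # Deletes
    rng.poisson(4.73, n),    # Splits
    rng.poisson(8.64, n),    # Joins
    rng.poisson(0.28, n),    # Moves
    rng.integers(0, 14, n),  # Familiarity
])
# Synthetic response: annotation time in seconds, built from the
# reported per-edit time estimates plus noise.
y = 437 + X @ np.array([145.0, 3.5, 5.7, 3.2, 45.0, -2.3]) \
        + rng.normal(0, 120, n)

# Multivariate linear regression with intercept; yields a table of
# estimates, standard errors and p values analogous to Table 4.
model = sm.OLS(y, sm.add_constant(X)).fit()
print(model.params)    # intercept and per-edit time estimates
print(model.pvalues)   # significance of each covariate
print(model.rsquared)  # variance explained (R²)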
7. CONCLUSIONS

In this paper we have presented our new melody annotation software Tony, and its evaluation with respect to two aspects: firstly, an evaluation of the built-in note transcription system, and secondly a study of how manual edits and familiarity with the software influence annotation time.
  The note transcription results suggest that the pYIN note transcription method employed in Tony is state of the art in terms of both frame-wise accuracy and note-based evaluation. The study of manual edits shows the relative effort involved in different actions, revealing that Splits and Creates are particularly expensive edits. This suggests that for the task of note annotation, transcription systems should focus on voicing recall and note onset/offset accuracy.
  In summary, we have presented a state-of-the-art note annotation system that provides researchers interested in melody with an efficient way of annotating their recordings. We hope that in the long run this will create a surge in research on, and hence understanding of, melody and intonation, especially in singing.

7.1 Acknowledgements

Thanks to Emilio Molina for kindly sharing the results of the experiments in his recent evaluation paper [24], and to all survey participants. Matthias Mauch is supported by a Royal Academy of Engineering Research Fellowship.

A. SURVEY QUESTIONS

    • Have you ever used software to annotate pitch in audio recordings? (multiple choice)
    • What software tools/solutions for pitch annotation exist? List tools that you are aware of. (free text)
    • What characteristics of the tools would need to be improved to better suit your use case? (free text)
    • Comments (free text)
    • Your field of work (multiple choice)

B. REFERENCES

[1] S. Pant, V. Rao, and P. Rao, "A melody detection user interface for polyphonic music," in National Conference on Communications (NCC 2010), 2010, pp. 1–5.

[2] E. Gómez and J. Bonada, "Towards computer-assisted flamenco transcription: An experimental comparison of automatic transcription algorithms as applied to a cappella singing," Computer Music Journal, vol. 37, no. 2, pp. 73–90, 2013.

[3] C. Cannam, C. Landone, M. B. Sandler, and J. P. Bello, "The Sonic Visualiser: A visualisation platform for semantic descriptors from musical signals," in Proceedings of the 7th International Conference on Music Information Retrieval (ISMIR 2006), 2006, pp. 324–327.

[4] P. Boersma, "Praat, a system for doing phonetics by computer," Glot International, vol. 5, no. 9/10, pp. 341–345, 2001.

[5] A. de Cheveigné and H. Kawahara, "YIN, a fundamental frequency estimator for speech and music," The Journal of the Acoustical Society of America, vol. 111, no. 4, pp. 1917–1930, 2002.

[6] N. Bogaards, A. Röbel, and X. Rodet, "Sound analysis and processing with AudioSculpt 2," in Proceedings of the International Computer Music Conference (ICMC 2004), 2004.

[7] J. Devaney, M. Mandel, and I. Fujinaga, "A study of intonation in three-part singing using the automatic music performance analysis and comparison toolkit (AMPACT)," in Proceedings of the 13th International Society for Music Information Retrieval Conference (ISMIR 2012), 2012, pp. 511–516.

[8] M. Mauch, K. Frieler, and S. Dixon, "Intonation in unaccompanied singing: Accuracy, drift, and a model of reference pitch memory," Journal of the Acoustical Society of America, vol. 136, no. 1, pp. 401–411, 2014.

[9] M. Goto, "A real-time music-scene-description system: Predominant-F0 estimation for detecting melody and bass lines in real-world audio signals," Speech Communication, vol. 43, no. 4, pp. 311–329, 2004.

[10] J. Salamon and E. Gómez, "Melody extraction from polyphonic music signals using pitch contour characteristics," IEEE Transactions on Audio, Speech, and Language Processing, vol. 20, no. 6, pp. 1759–1770, 2012.

[11] C. E. Seashore, "The tonoscope," The Psychological Monographs, vol. 16, no. 3, pp. 1–12, 1914.

[12] D. Talkin, "A robust algorithm for pitch tracking," in Speech Coding and Synthesis, 1995, pp. 495–518.

[13] H. Kawahara, J. Estill, and O. Fujimura, "Aperiodicity extraction and control using mixed mode excitation and group delay manipulation for a high quality speech analysis, modification and synthesis system STRAIGHT," in Proceedings of MAVEBA, 2001, pp. 59–64.

[14] J. Devaney and D. Ellis, "Improving MIDI-audio alignment with audio features," in 2009 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2009, pp. 18–21.

[15] M. P. Ryynänen, "Probabilistic modelling of note events in the transcription of monophonic melodies," Master's thesis, Tampere University of Technology, 2004.

[16] O. Babacan, T. Drugman, N. D'Alessandro, N. Henrich, and T. Dutoit, "A comparative study of pitch extraction algorithms on a large variety of singing sounds," in Proceedings of the 38th International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2013), 2013, pp. 7815–7819.
[17] M. Mauch and S. Dixon, "pYIN: A fundamental frequency estimator using probabilistic threshold distributions," in Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2014), 2014, pp. 659–663.

[18] E. Molina, L. J. Tardón, I. Barbancho, and A. M. Barbancho, "The importance of F0 tracking in query-by-singing-humming," in Proceedings of the 15th International Society for Music Information Retrieval Conference (ISMIR 2014), 2014, pp. 277–282.

[19] E. Benetos, S. Dixon, D. Giannoulis, H. Kirchhoff, and A. Klapuri, "Automatic music transcription: Challenges and future directions," Journal of Intelligent Information Systems, vol. 41, no. 3, pp. 407–434, 2013.

[20] T. Cheng, S. Dixon, and M. Mauch, "A deterministic annealing EM algorithm for automatic music transcription," in Proceedings of the 14th International Society for Music Information Retrieval Conference (ISMIR 2013), 2013, pp. 475–480.

[21] M. P. Ryynänen and A. P. Klapuri, "Automatic transcription of melody, bass line, and chords in polyphonic music," Computer Music Journal, vol. 32, no. 3, pp. 72–86, 2008.

[22] T. De Mulder, J. Martens, M. Lesaffre, M. Leman, B. De Baets, and H. De Meyer, "Recent improvements of an auditory model based front-end for the transcription of vocal queries," in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2004), vol. 4, 2004, pp. IV-257–IV-260.

[23] M. R. Schroeder, "Period histogram and product spectrum: New methods for fundamental-frequency measurement," The Journal of the Acoustical Society of America, vol. 43, no. 4, pp. 829–834, 1968.

[24] E. Molina, A. M. Barbancho, L. J. Tardón, and I. Barbancho, "Evaluation framework for automatic singing transcription," in Proceedings of the 15th International Society for Music Information Retrieval Conference (ISMIR 2014), 2014, pp. 567–572.

[25] J. Salamon, E. Gómez, D. P. W. Ellis, and G. Richard, "Melody extraction from polyphonic music signals: Approaches, applications, and challenges," IEEE Signal Processing Magazine, vol. 31, no. 2, pp. 118–134, 2014.

[26] C. Raffel, B. McFee, E. J. Humphrey, J. Salamon, O. Nieto, D. Liang, and D. P. W. Ellis, "mir_eval: A transparent implementation of common MIR metrics," in Proceedings of the 15th International Society for Music Information Retrieval Conference (ISMIR 2014), 2014, pp. 367–372.

[27] R. Bittner, J. Salamon, M. Tierney, M. Mauch, C. Cannam, and J. Bello, "MedleyDB: A multitrack dataset for annotation-intensive MIR research," in Proceedings of the 15th International Society for Music Information Retrieval Conference (ISMIR 2014), 2014, pp. 155–160.
                                                                 31
                             UNDERSTANDING ANIMATED NOTATION
staff notation to communicate their musical ideas [9]. Some composers even experimented with video scores [5]. The diversity of appearances and the desire to overcome restrictions are common to avant-garde graphic notation and to animated notation today.

Figure 1. Musical Graphic, December 1952 by Earle Brown [21].

From the 1970s onwards, composers seem to have lost interest in graphic notation. According to Julia H. Schröder, visual artists developed the ideas further, as "their interest in the individual handwriting manifesting itself in musical graphics is greater than that of composers, who were concerned with the establishment of a new, normative graphic canon" [5]. Schröder's analysis reveals two important distinctions regarding graphic and animated notation. First, avant-garde composers wanted to develop a generally applicable kind of graphic notation, implying a certain framework and rules, so that one could work with it as with staff notation. As this did not work out, they lost interest. Second, the avant-garde composers' self-conception and position within music history regarding the development of a new notation was entirely different from the situation of animated notation today. In "Darmstädter Beiträge zur neuen Musik – Notation" [19], composers like Brown, Kagel and Haubenstock-Ramati wrote about their practices using graphic notation. For them it was clear and self-evident that the composition of new music required a new music notation. Furthermore, this new notation could only come to life by somehow overcoming regular staff notation [19]. Today, animated notation can be considered a tool. It extends the possibilities of notating contemporary music without neglecting other techniques or abandoning staff notation. Accordingly, animated notation, or rather the people using it, does not aim to establish a rigid framework and generally applicable rules.

Since the 1970s, very different connections of sound or music and visuals have come to life. Visual music, VJing and especially the music video have shaped our everyday culture, as have film, art, advertising and of course music itself [10]. Technological progress, manifesting for instance in ubiquitous computing power, has had a major impact on music production, performance and consumption [4]. Regular staff notation, on the other hand, has undergone only minor changes in the last 50 years, and its core system, meaning how music is principally notated, has remained the same. Certainly, influences of the avant-garde's developments can be traced in today's notation practice. Very often staff notation is extended by individual signs and symbols to indicate sounds or techniques that are otherwise not communicable. In 2013 Christian Dimpker published his book Extended Notation, which develops a consistent notation system for extended instrumental playing techniques and electroacoustic music, based on common practice [6]. Generally, staff notation surely remains satisfyingly expressive. However, compared to the influence of the computer on music itself, music notation (apart from notation software like Sibelius or Finale) seems almost unaltered by technological progress. Only in recent years, with concepts of interdisciplinarity, intermedia and hybrid arts, has a growing interest in alternative notation utilizing computational power emerged. Practice shows there are multiple areas of application that feature new ways of music making and composition; animated notation is just one amongst many. Yet the utilization of screens and animation techniques for notational purposes is in its early stages. Even a commonly used term for this kind of notation can hardly be found. The Australian composer and researcher Lindsay Vickery generally calls them screen scores [20], while Severin Behnen, in his PhD thesis, speaks of motion graphic scores with the subdivisions animated, interactive and plastic scores [1]. An online collection of several works by the composer Páll Ivan Pálsson [24] and the website animatednotation.com by Ryan Ross Smith [26] display a wide range of different scores and approaches. Animated scores use various techniques and styles and are created with various software. In animated notation, graphical attributes are not strictly mapped to specific sounds or actions; there are no fixed symbols and no syntax. Although animated scores often share common features, for instance a 'play-head' that indicates the actual position within a scrolling score [20], none of these features is obligatory or generally valid. Basically, each score looks different. On the one hand this seems to be a deficiency; on the other hand this freedom is the basis for individual artistic and musical expression and the possibility to create new music [9, 19], just like in the 1960s.
SPECIAL FEATURES

1 Two Areas of Application

Let's take a look at two areas of actual application to show two major features of animated notation. The area where animated notation best demonstrates its intuitive applicability is education. Dabbledoo Music, for instance, is a project by Shane Mc Kenna and Kilian Redmond from Ireland [22]. They call it "a new multimedia experience for young musicians… It aims to encourage creative music making and group performance in a new and exciting way." [22] Various types of animated notation, varying from simple to complex, are used to encourage and educate children to improvise and compose within a structured framework. In particular, timing and interaction can be practiced without the necessity of learning a complicated notational system. Another interesting example is the artistic research project Voices of Umeå by Anders Lind at Umeå University, Sweden. He utilizes The Max Maestro, a standalone application programmed in Max/MSP that features an animated notation which can be controlled in real time [23]. A choir of musically untrained people is conducted via The Max Maestro to produce vowels and other sounds. The length of each vowel, the dynamics and the structure over time are indicated. It basically allows participants to perform prima vista; thereby performers become a part of the real-time compositional process [23]. Again, the intuitiveness and simplicity of the animated score, in relation to the high quality of the musical performance, is remarkable.

Figure 2. Screenshot of Dabbledoo Music website (beta version) [22].

A second area of application are musical genres or works that utilize alternative instruments, a mix of various instruments (as in live electronic music with acoustic and computer instruments) or are composed for indeterminate instrumentation. As there is no common practice, the notation of alternative instruments or objects can be accomplished on a very individual basis by the composer. For instance, abstract computer sounds cannot be adequately represented in regular staff notation. By using abstract graphics, which can be mapped to musical parameters in a customized manner, animated notation can create a common ground, a kind of musical communication platform for all instruments involved [7]. Furthermore, music like live electronic music is often improvised. Apart from offering a score that is able to structure and define musical improvisation in general, animated notation usually manifests in a video (file) and is therefore a time-based medium [2]. This especially allows events to be structured accurately over time, and the score is as long as the piece. Hence, frequently used techniques like score following, stop watches or other means of triggering musical events and synchronizing acoustic and computer instruments, with their known drawbacks, become obsolete.

2 Tackling a Typology

After examining the development of contemporary scores, the composer and researcher Lindsay Vickery suggested four different types of what he calls screen scores: namely scrolling, permutation, transformative and generative scores [20]. Vickery's terminology was introduced in a historical context. Furthermore, his subdivisions mainly describe the visual appearance of animated scores; a scrolling score, for example, actually scrolls. Additionally, in practice many scores mix techniques and might not be described accurately by one of the four types. Therefore this rather strict distinction is not truly useful for a categorization of animated scores. Still, the terminology proves very useful when discussing the appearance of animated notation in general. As mentioned earlier, the generative type is neglected in this paper.

A frequently used type of animated score is the scrolling score [20] (see e.g. figure 4). These kinds of scores have several advantages. They support western reading habits, as they usually scroll from left to right. These scores often work with a play zone or another indication that signals to the performer which part to play. Many use a so-called play-head, which is usually a line that graphics have to cross to indicate when to play them (see fig. 4); a minimal sketch of this mechanism follows below. However, the most important feature of a scrolling score is the possibility to read ahead. Performers are of course used to this from staff notation. A lack of this feature might therefore cause considerable problems for musicians using an animated notation [7]. Scrolling scores often draw on the performers' preliminary knowledge, for instance that relative pitch height is indicated on the vertical axis.
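To make the play-head mechanism concrete, here is a minimal sketch in Python of a scrolling score's core logic. The glyph model and all parameter values are invented for illustration and are not taken from any specific score discussed here: each graphic holds a horizontal position that decreases as the score scrolls, and the moment it crosses the static play-head line its sonic event is due.

# Minimal sketch of a scrolling score with a static play-head.
# All names and values are illustrative assumptions.

from dataclasses import dataclass

@dataclass
class Glyph:
    x: float              # horizontal position in screen units
    y: float              # vertical position, often read as relative pitch
    size: float           # often mapped to dynamics
    triggered: bool = False

PLAYHEAD_X = 100.0        # the static vertical line that graphics must cross
SCROLL_SPEED = 30.0       # screen units per second, scrolling right to left

def advance(glyphs, dt):
    """Scroll all glyphs left and return those crossing the play-head."""
    events = []
    for g in glyphs:
        g.x -= SCROLL_SPEED * dt
        if not g.triggered and g.x <= PLAYHEAD_X:
            g.triggered = True
            events.append(g)   # 'play now': pitch ~ g.y, loudness ~ g.size
    return events

score = [Glyph(x=250, y=0.5, size=1.0), Glyph(x=380, y=0.8, size=0.3)]
for _ in range(600):           # e.g. ten seconds at 60 frames per second
    for event in advance(score, dt=1/60):
        print(f"attack: pitch~{event.y:.2f}, dyn~{event.size:.2f}")

Because everything to the right of the play-head remains visible, this model also exhibits the read-ahead property discussed above: the performer can inspect any glyph before it arrives.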
Second, there are permutation or coherent scores, like, for instance, some of Ryan Ross Smith's studies [26]. These scores usually focus on the sequence of sound events and are therefore actional. They appear as circular shapes, like clocks, grids or other networks of (sometimes multilayered) objects that change sequentially (permutation) over time and indicate precisely when, and sometimes even how long, to play. Often the number of players is also clearly indicated. Depending on the graphic design of the score, it is possible for the performer to read ahead (see fig. 2). Generally, these scores convey the structure of events over time and not specific sounds. This allows them to be very precise regarding the sequence of events. If the sequences are not too fast, these scores could even be played prima vista by experienced musicians. Of course there are also permutation scores where performers have clear instructions not only when to play, but also what to play. These scores can then be regarded as the most accurate type of animated notation, requiring the least interpretational effort and the least amount of improvisation from the performer.

Finally, there are transformative or morphing scores. They are usually highly associative in character. Graphics move on the screen or change their overall appearance from one distinct graphical object to another (e.g. morphing). Movements in any direction along the X, Y and Z axes are possible. This does not allow performers to look ahead; therefore these scores require profound involvement by the performer. Without further instructions or guidelines by the composer, these scores are musical graphics in motion in the sense of December 1952. Nevertheless, it is possible to connect visual and musical attributes. For instance, the overall appearance, the design of graphics, color, shape and of course the speed of the score can be mapped by the composer to convey specific sonic attributes.

When analyzing contemporary animated notations, various mixed types of the above-mentioned appearances can be found. Furthermore, as there are no generally valid and commonly accepted rules for the design and use of animated scores, a strict categorization using Vickery's terms is difficult. Therefore I propose a three-dimensional coordinate system in which scores can be positioned in a more flexible manner. For instance, a scrolling score can be a rather associative score that works instructively and is actional, or anything in between. Hence, this typology does not say anything about the visual appearance or the usability of the score.

Figure 3. 3D-coordinate system to categorize animated scores. Example scores "SYN-Phon" and "Study No.31".

x-axis (red): associative - instructive. This distinction refers to the appearance and possible interpretation of an animated score. A purely associative score can be regarded as a sheer trigger for improvisation, similar to a musical graphic. This means musical or acoustic parameters are not clearly mapped to graphical ones by the composer. What the color, size or motion of a graphic indicates is not defined; rather, the overall look and appearance of the score should influence the improvisation of the performer. An instructive score, on the other hand, indicates what to do and, often precisely, when to do it. The score communicates instructions. The clock-like score on the Dabbledoo website (fig. 2) is a rather instructive score. The clock hand indicates when to play, and the color indicates the instrument group (red or blue) or a pause (white).

y-axis (blue): level of improvisation. The position on the y-axis indicates overall how much improvisation is needed to perform the score. It is very likely that an associative score requires a lot of improvisation by the performer. Nevertheless, there are associative scores where a few musical parameters are clearly mapped to graphical parameters; for instance, performers simply play when graphics are moving. On the other hand, an instructive score can be very precise with certain parameters while other parameters need to be improvised.

z-axis (green): tonal - actional. If not specified by the composer, the distinction between tonal and actional can sometimes be difficult. Tonal and actional refer to whether a graphic concerns the sound or the means of execution. In other words, tonal graphics describe what to play, while actional graphics indicate when to play or what to do. Consider again the example of the clock in figure 2. This score is rather actional: the color refers to the instrument group involved, while for the music itself shapes, colors and motion have no meaning; what to play is not indicated. The example SYN-Phon in figure 4 is tonal and actional at the same time. The red play-head indicates when and how long to play, while, for instance, the white curvy line at the right side of the picture also indicates a kind of slow vibrato.
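Read as a data model, the proposed typology assigns each score a position on three continuous axes rather than a single category. The Python sketch below illustrates this; the concrete coordinate values for the two example scores from figure 3 are illustrative guesses, not measurements given in this paper.

# Sketch of the proposed 3D typology as a data model.
# Axis values run from -1.0 to +1.0; the example coordinates
# are assumptions for illustration only.

from dataclasses import dataclass

@dataclass(frozen=True)
class ScorePosition:
    associative_instructive: float  # x-axis: -1 purely associative, +1 purely instructive
    improvisation: float            # y-axis: 0 little improvisation, +1 fully improvised
    tonal_actional: float           # z-axis: -1 purely tonal, +1 purely actional

typology = {
    # Scrolling score, y-position read as pitch: fairly instructive,
    # moderate improvisation, tonal and actional at once.
    "SYN-Phon (Şişman)": ScorePosition(0.3, 0.5, 0.0),
    # Permutation score with explicit attack/mute events: instructive,
    # little improvisation, strongly actional.
    "Study No. 31 (Smith)": ScorePosition(0.9, 0.1, 0.9),
}

for name, pos in typology.items():
    print(name, pos)

The point of the model is exactly what the text argues: a score is not forced into one of Vickery's four types but occupies a position, possibly anywhere in between.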
VISUAL COMMUNICATION PROCESS

The visual communication process describes how graphical elements (e.g. staff and notes on paper, or motion graphics on a screen) are understood by the receiver (e.g. a violin player). Understanding the visual communication process of animated notation is crucial for understanding animated notation itself. Many problems derive from misconceptions and wrong expectations about how information, like playing instructions, is communicated in animated notation. An example: as mentioned in paragraph 2, avant-garde composers lost interest in graphic notation as they could not establish a new normative graphic canon. This loss of interest had several reasons that cannot be discussed in detail here. However, one important point was exactly this misconception of the visual communication process. Avant-garde composers regarded graphic notation as the successor of regular staff notation [5]. Therefore they assumed that it would work the same way. However, there is a disparity between the communication process of western staff notation and that of animated notation. Animated notation consists of abstract graphics or objects in motion; usually it is a video, in other words moving pictures. According to visual communication theory, the logic of an image (or a video) is different from the logic of a text. It is not bound to a certain framework or rules. Therefore we cannot read and understand a picture in the same way as we would read and understand a text [13]. Pictures cannot be read; they can only be analyzed and interpreted. The more unspecific, unclear or abstract the image, the more sketchy and difficult the interpretation. In this context, there is no right or wrong interpretation as long as it is coherent and comprehensive. Surely, scores in staff notation also need a certain level of interpretation. Still, staff notation can be read. Similar to a text using words, one first has to learn the signs, modes and rules of staff notation to be able to read and execute them. Therefore the visual communication process of animated notation and the visual communication process of staff notation work entirely differently. In consequence, avant-garde composers were disappointed by the potential of graphic notation regarding the "storage" of a musical idea, because a score could be interpreted in so many different ways. Their desire to establish a new normative canon had to remain unfulfilled.

The German communication theorist Heinz Kroehl discusses sign systems and visual communication in connection with semiotics and the theories of Charles Sanders Peirce [14]. According to Kroehl there are three major communication systems: art, everyday life and science [11]. The everyday-life system refers to the real objects that surround us; it is not applicable when discussing music notation. Things have a name, and we can assume that we are understood by others if we use the right name for the right object. When I say "bread", everybody capable of the English language will know what I mean. In the scientific system, signs refer to definitions and rules. Staff notation consists of a system of specific rules, syntax and modes that need to be learned and understood in order to apply them in musical performance. In other words, there is a (pre-)defined connection between sign and sonic result. This connection was shaped over the centuries, from the neumes of the early Middle Ages to the western staff notation that we know and use today. Someone able to read staff notation knows exactly which key to press on a piano keyboard when reading a specific note in a score, e.g. a C4. Another musician reading the very same score will therefore press exactly the same key on the piano keyboard when reading that C4. To interpret this C4 as a completely different pitch, and therefore to press any key other than the C4, would be regarded as wrong. Therefore the transfer of knowledge, the visual communication process in staff notation, can be called scientific according to Kroehl's distinction [11]. Animated notation works entirely differently. The interpretation of one graphic could sound different every time it is performed. In contrast to staff notation, animated notation operates in the artistic system [11]. The artistic system conveys possibilities. It is not possible that two people, in our case musicians, interpret or understand a graphic in exactly the same way and thus play identically. An animated notation is an invitation for composers and performers to start their own so-called mapping process: they need to connect, or map, visual attributes with sonic attributes. In staff notation the mappings by composer and performer are basically congruent. In animated notation the mapping process is done individually, first by the composer and then by the performer.

It is important to understand the peculiarities of animated notation in the visual communication process in order to comprehend its advantages and disadvantages as a tool for composition. Animated scores are intuitively applicable. Any musical parameter, like pitch, dynamics or even timbre, and any other playing instruction can be conveyed. Animated notations can be simple and utilized by children and musically untrained people. On the other hand, animated scores can be quite sophisticated and require experienced and skilled musicians.
The advantages of animated notations are at the same time the reasons for their drawbacks. This type of notation cannot store music in the way staff notation does. It is not possible to communicate distinct pitches, harmonies or rhythms in a way that they can be repeated in a similar manner in each performance. Still, animated notation is music notation. It does not lead to a random performance or purely free improvisation: the composer defines the limits. Animated notation is simply a different approach to music composition and interpretation.

1 Design, Mapping and Guidelines

The design of the score is of course a crucial part that requires some knowledge of graphic design and motion graphics, in order to compose and not "to be composed" by a piece of software. In other words, it is possible that a lack of experience and the limitations of a certain software have a significant impact on the design process of a score. This influence should be strictly avoided. However, the major difficulty in animated notation is the connection, or mapping, of visual and musical parameters [7]. Most musicians are used to western staff notation; for them it is clear how notes should be interpreted. But how does a red square sound compared to a green triangle? As described before, there cannot be a clear answer to that, as animated notation communicates artistically. As mentioned already, animated notation needs to be interpreted, and this interpretation might vary. This leads us to the mapping process. In staff notation, clef, key, lines, bars and notes indicate precisely what (e.g. which pitch) to play; the major mapping process has already been done, as the notation relies on a set of specific, universally accepted rules. In graphic and animated notation, meaning needs to be created individually by interpreting graphics. The mapping process describes the creation of meaning by connecting graphics and graphical attributes with sounds and sonic attributes. This process is divided into two separate steps. The first step is the mapping done by the composer (c-mapping). The composer tries to create a score which allows comprehensible connections between graphics and sounds, or graphics and actions. Comprehensibility is the key. It is advisable to build upon previous knowledge and commonly accepted relationships: for instance western color coding, the Cartesian coordinate system with pitch on the y-axis and time on the x-axis, connecting the size of graphics with musical dynamics, or utilizing the inherent motion of graphics on the screen to display a phrase or motive. The second step is the more delicate mapping done by the performer (p-mapping). Now the performer interprets the score and tries to find connections between the visuals and playing music. The p-mapping might vary significantly from the c-mapping. However, the more precise, distinct and comprehensible the c-mapping, the more definite the score, and the less interpretation work (and improvisation) is required of the performers. The p-mapping can also be supported by additional guidelines, in which the composer talks about the work itself, clarifies how to read the score, explains the meaning of certain graphics or offers other means to facilitate the interpretation and mapping process for the performer. For instance, one major distinction that can be made by composers, and that contemporary notation has struggled with for quite some time (albeit in a slightly different context [17]), is the distinction of graphics into either tonal or actional types. Tonal means the graphics convey sound characteristics; they refer directly to the sound and its acoustic parameters. Actional concerns the means of playing or execution; actional graphics do not convey what to play or how it should sound, but what to do or, foremost, when to play. Another possibility is to map instruments to a certain color. Like the design of the score, the use of additional guidelines or other explanations is of course completely up to the composer.
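The two-step mapping process can be pictured as two lookup tables, one per participant. The Python sketch below shows a hypothetical composer-side c-mapping built on the "previous knowledge" recommended above (vertical position as pitch, size as dynamics, color as instrument group); the p-mapping is then whatever table the performer settles on, which may or may not coincide with it. All attribute names and assignments are invented for illustration.

# Sketch of a composer-side c-mapping: visual attributes translated
# into sonic attributes. The assignments are hypothetical examples of
# the commonly accepted relationships suggested above.

C_MAPPING = {
    "y_position": "relative pitch (higher on screen = higher pitch)",
    "size":       "dynamics (larger = louder)",
    "color":      "instrument group (e.g. red = trumpet, blue = cello)",
    "speed":      "pace of the phrase or gesture",
}

def p_mapping(graphic, mapping):
    """A performer's reading: apply whatever table the performer has
    settled on, which may or may not coincide with the c-mapping."""
    return {mapping[attr]: value for attr, value in graphic.items() if attr in mapping}

white_ball = {"y_position": 0.2, "size": 0.9, "color": "white", "speed": 0.0}
print(p_mapping(white_ball, C_MAPPING))

The closer the performer's table is to the composer's, the more "congruent" the two mappings are in the sense described above; guidelines by the composer serve precisely to pull the two tables together.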
2 Two Examples

Figure 4. Screenshot of a scrolling score, SYN-Phon by Candaş Şişman, featuring a red play-head [25].

The first example is SYN-Phon by Candaş Şişman [25] (see fig. 4). On Şişman's website you will find a video of the score with a recording of a performance; there one can hear one possible interpretation of the score. SYN-Phon is a scrolling score featuring a red play-head. The instrumentation is trumpet, cello and electronics/objects. Şişman himself calls it a graphical notation. White graphics on a black background scroll from right to left, indicating when to play and what to play. These graphics are tonal and actional graphics at the same time: the x-axis clearly indicates time, while the y-axis indicates a relative pitch. There is no clear indication within the graphics that refers to a specific instrument. Therefore it is up to the performers to decide who plays which graphics or parts of the score. The image in figure 4 shows the very beginning of the piece. The big white ball that has just passed the play-head was interpreted as a presumably electronic, gong-like sound, while the smaller dots that follow are short strokes by the cello that become a continuous tone, changing pitch according to the curves of the line. Later the score displays several different types of objects at the same time; they are interpreted by different instruments. When watching the video on Şişman's website, one can state that the score generally works very accurately regarding the structure of events over time. The mapping of visuals and music also works out well: most graphics find a comprehensible acoustic equivalent. What can sometimes be a little distracting is the inconsistency of the mapping. For example, some uniquely defined graphics (dots connected with thin lines) are played by both the trumpet and the live electronics, while the cello repeats similar playing techniques and sounds although the graphics look quite versatile. Furthermore, performers do not interpret graphics consistently. The snake-like line on the very right in figure 4 is played by the cello as a tone slowly rising and falling in pitch. Visually, the interval modulates around a kind of center frequency and should be larger at the beginning of the snake, while at the end the interval should be smaller. In the performance, the cellist plays the interval modulating around a rising center frequency, which does not correspond properly to the visuals. It could be discussed, though, whether this is a misinterpretation of the score by the performer, or whether the composer simply did not prohibit interpreting the score more freely.

The second example is Study No. 31 for 7 triangles and electronics by Ryan Ross Smith [26]. This piece belongs to the permutation/coherent type and comes with a few explanatory guidelines by the composer. There are seven imaginary circles with cursors that indicate which part to play, one cursor/circle for each triangle player. Each circle features four attack/mute event nodes connected by an arc. The graphics are actional, as they indicate when to hit a triangle and how long it should ring. The nodes and the arcs change over time. A standalone Max/MSP patch is triggered by the score; it records the triangle sounds, manipulates them and plays them back automatically. Hence, there is no need to indicate the live electronics in the score. The animated notation hardly requires any interpretational work by the performers. The way the score is designed indicates directly that the piece is about structure, or patterns respectively. The patterns change over time while the overall form of the piece remains the same. The score is very intuitive: with very few explanations, even musicians with limited skills are able to perform the work in a satisfactory way. Since the score is instructive, the graphics are actional and not much improvisation is demanded, the score constitutes a kind of minimal-music approach that vividly demonstrates how simple and precise animated notation can work.

Figure 5. Screenshot of a performance documentation video featuring the score of Study No.31 by Ryan Ross Smith [26].

CONCLUSIONS

Animated notation is an alternative approach to contemporary music composition and performance. Its intuitive applicability and the possibility to notate any kind of sound source or non-musical instruments are the major advantages of this kind of music notation. However, the visual communication process, meaning the transfer of a musical idea in general and of playing instructions in particular, is significantly different from that of regular staff notation. Animated scores cannot be read; they can only be interpreted, and this interpretation might vary significantly. Composers have to understand these differences to be able to utilize the advantages of animated notation. The future development of hardware and software will surely influence the evolution of animated notation and the possibilities to interconnect it with other techniques. As a creative tool, it has by no means reached its limits yet. There is still a lot to research and to explore in the field of animated notation.

REFERENCES

[1] S. H. Behnen, "The Construction of Motion Graphics Scores and Seven Motion Graphics Scores," doctoral dissertation, University of California, 2008.

[2] M. Betancourt, The History of Motion Graphics - From Avant-Garde to Industry in the United States. Rockville: Wildside Press, 2013.

[3] J. Cage, Notations. New York: Something Else Press, 1969.

[4] N. Collins, The Cambridge Companion to Electronic Music. Cambridge: Cambridge University Press, 2007.

[5] D. Daniels, S. Naumann, See this Sound - Audiovisuology Compendium. Cologne: Walther Koenig, 2009.

[6] C. Dimpker, Extended Notation - The depiction of the unconventional. Zürich: LIT Verlag, 2013.
[7] C. Fischer, "Motion Graphic Notation - a tool to improve live electronic music practice," Emille Vol. 11 - Journal of the Korean Electro-Acoustic Music Society, 2013.

[8] C. Gresser, "Earle Brown's Creative Ambiguity and Ideas of Co-Creatorship in Selected Works," Contemporary Music Review 26 (3), 2007, pp. 377-394.

[9] E. Karkoschka, Das Schriftbild der neuen Musik. Celle: Hermann Moeck Verlag, 1966.

[10] H. Keazor, T. Wübbena, Rewind Play Fastforward - The Past, Present and Future of the Music Video. Bielefeld: transcript Verlag, 2010.

[11] H. Kroehl, Communication Design 2000 - A Handbook for All Who Are Concerned with Communication, Advertising and Design. Basel: Opinio Verlag AG, 1987.

[12] A. Logothetis, Klangbild und Bildklang. Vienna: Lafite, 1999.

[13] M. Müller, Grundlagen der visuellen Kommunikation. UVK, 2003.

[14] C. S. Peirce, Phänomen und Logik der Zeichen. Frankfurt am Main: Suhrkamp Verlag, 1983.

[15] T. Sauer, Notations 21. London: Mark Batty, 2009.

[16] R. M. Schafer, The Composer in the Classroom. Scarborough: Berandol Music Limited, 1965.

[17] C. Seeger, "Prescriptive and Descriptive Music-Writing," The Musical Quarterly, Vol. 44, Oxford: Oxford University Press, 1958.

[18] K. Stockhausen, Nr. 3 Elektronische Studien - Studie II. London: Universal Edition, 1956.

[19] E. Thomas, Darmstädter Beiträge zur neuen Musik - Notation. Mainz: B. Schott, 1965.

[20] L. Vickery, "The Evolution of Notational Innovations from Mobile Score to Screen Score," Organised Sound 17(2). Cambridge: Cambridge University Press, 2012.

Online Resources

[21] Artlicker Blog. December 1952. Retrieved from https://artlicker.wordpress.com/tag/earle-brown/ on January 8, 2015.

[22] Dabbledoo Music. Activities - The Clock. Retrieved from http://beta.dabbledoomusic.com/clock/level1-section2.html on January 8, 2015.

[23] Umeå University. Voices of Umeå Project. Retrieved from http://www.estet.umu.se/konstnarligforskning/ on January 8, 2015.

[24] Pálsson, Páll Ivan. Animated Notation. Retrieved from http://animatednotation.blogspot.com/ on January 8, 2015.

[25] Şişman, Candaş. SYN-Phon. Retrieved from http://www.csismn.com/SYN-Phon on January 8, 2015.

[26] Smith, Ryan Ross. Complete Studies. Retrieved from http://ryanrosssmith.com/ on January 8, 2015.
AN ATOMIC APPROACH TO ANIMATED MUSIC NOTATION
mobility largely conceptual, not perceptibly actualized [2].

Within these constraints, certain dynamic scoring practices present problematic actualization models. The scroll scores of Andy Ingamells feature long strips of paper, populated by small, multicolored circles that represent sonic events. In performance, the unrolled scroll is physically pulled, or scrolled, past the ensemble by two assistants. While the element of human interaction is clearly present, the assistants are not performers per se, but simply provide the mechanics necessary for Ingamells' dynamic requirements, autonomous of the performers, the theatricality of it all notwithstanding.

Similarly, works that involve real-time human-computer interaction to influence the score clearly display actualized notational dynamism in real time: Harris Wulfson's LiveScore, in which the audience, through its interaction, "becomes a part of the performance," but "never exactly cross[es] over into the 'proper' domain inhabited by the ensemble performers" [3], or Nick Didkovsky's Zero Waste, in which the pianist, in tandem with the score application, creates "the composition through the act of performance" [4]. The performers do not lead in the conventional sense, but are led through the score by an actualized dynamic process, interactive or otherwise. Returning to Stockhausen, Klavierstück XI (or any conventional score, for that matter) may be considered dynamic in terms of its mobility [2], but the cursor, represented here by the performer's eye, is virtual, not actual or actualized. Simply put, agency lies primarily with the performer to activate or dynamize the conventional score, whereas the dynamic score has agency over the performer; movement is perceptible, not of the eye, but to the eye. While further discussion of the various distinctions between methods of real-time scoring practices may be warranted, it is beyond the scope of this paper. However, within the dynamic score exists the potential for a variety of dynamic representations. AMN will be considered a form of real-time notation whose primary distinguishing feature is the actualization of contact and intersection, which provide perceptible indications of the specific temporal location of sonic events.

BASIC ELEMENTS OF ANIMATED MUSIC NOTATION

"A graphical method is successful only if the decoding is effective. No matter how clever and how technologically impressive the encoding, it fails if the decoding process fails." – Cleveland and McGill [5]

Introduction

Several high-level analyses and aesthetic reflections regarding the ontology of dynamic scores have provided foundational terminologies with which to describe the global functionalities of dynamic scoring techniques, including of course those represented by the wide variety of notational practices.3 Lindsay Vickery has most recently extended existing score distinctions to include Rhizomatic, 3D, and Animated scores respectively, distinctions based in part on their high-level functionality and visual design. Of primary interest in Vickery's current project is the investigation into the perceptible qualities of the dynamic score, including an in-depth account of sight-reading studies, contingent on the "natural constraints based on the limitations of human visual processing," and the impact these constraints may have on communicative clarity and on symbolic and functional design [6]. Similarly, David Kim-Boyle has recently investigated issues regarding the impact notational design may have on the relationship between score functionality and audience perception [7]. These observations begin to enhance the distinction between high-level dynamic scoring approaches and the low-level functionalities that lead to their actualization, and suggest that analytics regarding functional and perceptible effectiveness can be assessed at the symbolic and micro-functional level. To this end, an in-depth, low-level account of AMN specifically is largely absent, its admittedly pedantic particulars assumed, rendering the term AMN itself unfortunately colloquial.4 I believe that suggesting particular delineations and definitions will lead toward a richer discourse regarding AMN specifically, and will distinguish AMN as a distinct methodology within the broad category of dynamic scoring, while also, through a deliberate focus on the author's own creative practice, suggesting that these distinctions may be limited to particular compositional practices. To this end, a reductionist, atomic approach will be used to unpack and define the low-level elements of AMN. This reductionist analysis will not focus on musical content or concept, but target the nuts and bolts, so to speak, including prevalent symbologies and their respective dynamisms, symbol design and interaction, and an examination of actualized indication, including contact and intersection. As a global mapping of AMN practices is beyond the scope of this paper, those notational approaches that most clearly represent a clearly defined symbology, perceptible functionality, and actualized indication will be prioritized.

3 Scholarly contributions can be largely attributed to the work of Cat Hope, Lindsay Vickery, David Kim-Boyle, Jason Freeman, Pedro Rebelo and Gerhard E. Winkler, among many others, while their artistic contributions, and those within the field of dynamic scoring in general [Páll Ivan Pálsson's animatednotation.blogspot.com and the author's animatednotation.com provide numerous examples], continue to be significant.

4 It has been my admittedly contrary intention with animatednotation.com, following the model of animatednotation.blogspot.com, to be inclusive regarding the diversity of dynamic scoring practices, regardless of those low-level symbolic and functional requirements I will put forth here.
The symbolic elements of AMN, with which dynamic functionalities are actualized, can often be reduced to four increasingly complex entities: geometric primitives [primitives], semantically and visually integrated primitives [compound primitives], structures, and aggregates. A sketch of this hierarchy follows below, before each entity is discussed in turn.
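The skeleton below, a Python sketch, mirrors this four-level reduction as a containment hierarchy; the field names are assumptions chosen to anticipate the definitions that follow, not a model given in the paper.

# Skeleton of the four AMN entity levels as a containment hierarchy.
# Field names are illustrative; the paper defines the terms, not this model.

from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Primitive:
    shape: str                  # e.g. "circle", "square", "line"
    dynamic: bool = False       # static node / attack line vs. moving cursor

@dataclass
class CompoundPrimitive(Primitive):
    # a secondary primitive that enhances the primary, e.g. a vertical
    # line embedded in a circle to clarify the moment of intersection
    embellishment: Optional[Primitive] = None

@dataclass
class Structure:
    # two or more interrelated primitives, at least one of them dynamic,
    # so that contact or intersection can be actualized
    primitives: List[Primitive] = field(default_factory=list)

@dataclass
class Aggregate:
    # everything (primitives, structures, dynamisms) for a single player
    player: str = ""
    structures: List[Structure] = field(default_factory=list)

# Example: a minimal radial structure assigned to one player.
radial = Aggregate(
    player="triangle 1",
    structures=[Structure(primitives=[Primitive("circle"),
                                      Primitive("line", dynamic=True)])],
)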
Primitives

A primitive is an irreducible static or dynamic symbol.5 A primitive is irreducible when no aspect of its design can be removed without limiting its intended communicative potential. Channeling Goodman to some degree, Vickery writes: "One important factor contributing to the efficacy of notation is semantic soundness – the degree to which the graphical representation makes inherent sense to the reader, rather than necessitates learning and memorization of new symbols" [6]. To this end, a primitive, which may be of any shape or size, is often cast as a small geometric primitive [circles, squares, rectangles, lines (straight and curved)], favoring extensible clarity over verbose ambiguity [7]. As Gerhard E. Winkler notes, "the different parts of the score [are] to be reduced to a number of elements, which can be learned and 'trained' in advance, and which can be seized with 'one glance' immediately during a performance" [8].

A stationary, or static, non-line primitive is referred to as a node, while a stationary or static line is referred to as an attack line or play head. A non-line dynamic primitive is referred to as a cursor or attack cursor, while a dynamic line is often referred to as a dynamic attack line or a swiping play head (see Figure 1) [9]. Screen boundaries, the physical (or projected) limitations of the score, may or may not be treated symbolically, but are necessarily static.6 Representative images [frogs, spaceships, etc.] are less common, and often serve higher-level purposes, as a visual representation of a particular action to be performed or instrument to be activated, as opposed to the more robust, contextually variable symbol.7

Figure 1. y = f(x) (2012) by Þráinn Hjálmarsson [detail]. Example of sonic events represented as static circular nodes, their temporality denoted by the crossing of the dynamic attack lines/swiping play heads.

Two or more primitives can be seamlessly combined in such a way that a secondary primitive enhances or embellishes the primary, creating a compound primitive: for instance, a vertical line intersecting a circular primitive in order to clarify the moment of intersection with a static attack line.

The visual qualities of a primitive, including size and color, can also be modified to denote changes to the qualities of the corresponding sonic event, insofar as it can still be 'decoded' by the performer [5]. Changes of this type are, from the visual perspective, necessarily linked to the ontology of the irreducible primitive, and so would not be considered compound (see Figure 2).

Cases where information regarding the qualities of a particular sonic event as prescribed by a primitive appears in conjunction with the primitive, but is not visually embedded within it, can still be considered a compound primitive, so long as the information clearly references a single instance of a primitive (see Figure 3), as opposed to a modifier, which applies to two or more primitives and is thus not integrated.

Regions describe a subset of both static nodes and dynamic attack cursors, and are represented by a large primitive, often functionally integrated by intersecting a line (see Figure 5), or by its intersection by a line (see Figure 6). Regions generally represent an event that is sustained and/or modified over time. In K. Michael Fox's Accretion (2014), the ADSR curve is cast as a notational region, representing relative dynamics in its relation to the static attack line and vertical boundaries (see Figure 4).

5 The focus here is on those symbols abstracted from, or distinct from, conventional symbologies, but this should not presuppose their exclusion in practice.

6 This refers to the physical limitations of the score, not boundaries that may result from letterboxing, for instance, which may be treated dynamically.

7 In The Limitations of Representing Sound and Notation on Screen, Lindsay Vickery develops this through a continuum ranging from the spectrogram [detailed image] to the text score [distilled image]. The references to frogs and spaceships are in regard to the particularly interesting experiments in notational design by the S.L.A.T.U.R. collective in Reykjavík, Iceland.
                                                                               Figure 4. Accretion by K. Michael Fox (2014).
Structures

A structure refers to two or more primitives in some interrelated relationship. This may be represented by an object, for example a line connecting two circular primitives (see Figure 7 [left]), or created through some dynamic relationship between symbols (see Figure 7 [right]). A structure may contain one or more primitives that are not functionally symbolic, but clarify functionality and “semantic soundness.” [6] Many of the author’s radial scores incorporate a rotating line that connects a rotating attack cursor to a central static node. This line has negligible value regarding its notational functionality, but clarifies moments of contact and intersection (see Figure 8). At the lowest level, a single structure may contain the elements necessary to produce an actualized indication of contact or intersection, an AMN capable of determining the temporal location and quality of a sonic event. To this end, an instantiation of AMN will contain at least one structure, which will in turn contain two or more primitives, at least one of which will exhibit dynamic qualities (see Figure 7 [right]).

Aggregates

An aggregate is the collection of primitives, structures, and their respective dynamisms that corresponds to a single player. Aggregates may be visually displaced or integrated, and may be functionally autonomous (see Figure 9) or dependent in their relation to other aggregates (see Figure 10). Aggregates range in complexity from a single, simple structure (see Figure 9) to a set of integrated structures, each composed of several primitives (see Figures 11 & 12).
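   The nesting described above - primitives within structures, structures within aggregates, with at least one dynamic primitive per actualizable structure - lends itself to a simple data model. The following Python sketch is purely illustrative; its names and fields are assumptions of this edition, not part of any score generator discussed in the text.

    from dataclasses import dataclass, field
    from typing import List, Tuple

    @dataclass
    class Primitive:
        """A basic notational symbol: node, attack cursor, line, region, etc."""
        kind: str                      # e.g. "node", "attack_cursor", "region"
        position: Tuple[float, float]  # location on the score surface
        dynamic: bool = False          # True if the symbol moves or changes

    @dataclass
    class Structure:
        """Two or more interrelated primitives."""
        primitives: List[Primitive]

        def is_actualizable(self) -> bool:
            # An actualized indication requires at least two primitives,
            # at least one of which exhibits dynamic qualities.
            return len(self.primitives) >= 2 and any(p.dynamic for p in self.primitives)

    @dataclass
    class Aggregate:
        """The collection of structures corresponding to a single player."""
        player: str
        structures: List[Structure] = field(default_factory=list)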
   It is important to note that the apparent visual integration of autonomous aggregates with other aggregates does not necessarily imply any functional integration, dependence or influence (see Figure 9).

Figure 11. Study no. 31 (2013) by Ryan Ross Smith. Each aggregate (including one of the seven concentric circles, four dynamic ‘barbells,’ and single rotating attack cursor) is functionally autonomous, but visually integrated, in that each aggregate seems to encapsulate smaller aggregates.

   Furthermore, the distinction between autonomous and dependent aggregates is necessarily independent from any global functionality imposed by the score generator, as all elements of the score are necessarily dependent on the score generator for their actualization.

Traversal Duration

Traversal duration refers to the time it takes for an attack cursor to move from its starting point to the point of contact or intersection. Traversal offset refers to the distance a cursor, or line, travels over the course of the traversal duration (see Figure 13). Cursor traversal must be perceptible, or trackable, so that the performer can clearly gauge the arrival of an incoming cursor and prepare for the moment of attack; traversal duration and cursor offset must be considered in conjunction toward this end. Lindsay Vickery considers these issues in depth, suggesting that “at scroll rates greater than 3 cm per second the reader struggles to capture information” [6]. A concatenation of nodes or cursors may extend the potential ranges of both the traversal duration and cursor offset, due in part to the regularity or feel that concatenation may evoke (see Figure 8). Furthermore, these particular limitations of legibility can be exploited to create, as Winkler notes, “‘stress’ or even ‘frustration’” for the players, a musical and theatrical disruption [8], and to explore the extremities of such real-time practices [10].
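   The relationship between traversal offset, duration and legibility can be stated numerically. A minimal sketch follows; the threshold comes from the Vickery quotation above, while the function names and example values are illustrative only:

    def traversal_duration(offset_cm: float, speed_cm_per_s: float) -> float:
        """Time for a cursor to cover its traversal offset at a given speed."""
        return offset_cm / speed_cm_per_s

    def is_trackable(speed_cm_per_s: float, limit_cm_per_s: float = 3.0) -> bool:
        """Vickery's reported bound: above ~3 cm/s readers struggle
        to capture information."""
        return speed_cm_per_s <= limit_cm_per_s

    speed = 2.4                              # cm/s
    print(traversal_duration(12.0, speed))   # 5.0 seconds to cover 12 cm
    print(is_trackable(speed))               # True: below the ~3 cm/s bound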
   “...the true nature of things may be said to lie not in things themselves, but in the relationships which we construct, and then perceive, between them.” – Terence Hawkes [11]
   Actualized indication refers to a particular methodology by which the temporal location of a sonic event can be visually represented with a high degree of specificity. While the history of notation provides myriad ways to locate a sonic event, this section will deal with only those that best distinguish the functionalities necessary to AMN: contact and intersection.

   Contact is the “union or junction of surfaces” [12], and ‘surfaces’ will here refer to the boundaries of any object, visually defined by its own delineated boundaries [13]. In Features and Objects in Visual Processing, Anne Treisman writes “…boundaries are salient between elements that differ in simple properties such as color, brightness, and line orientation but not between elements that differ in how their properties are combined or arranged” [14]. In other words, in order for two objects, or symbols as it were, to appear to come into contact with one another, their respective visual representations must be well defined and differentiated, and at least one must demonstrate dynamic qualities.

   The physical gestures of performers and conductors alike most clearly represent the concept of contact as a meaningful, perceptible action. The conductor’s baton ‘bouncing’ off a virtual or imaginary boundary elicits a predetermined response based on score location and intensity; the violinist’s quick breath and head snap cue an upcoming unison entrance; the guitar player jumps off the drum kit at the correct time in order to make contact with the floor at the following downbeat. These physical gestures of contact, their necessary ‘setup,’ as (un)subtle as they may be, within virtual and physical constraints, more or less clearly convey a bundle of performance instructions in reference to, but beyond, any conventional notion of notation; in other words, the speed at which the violinist snaps her head back, and the amplitude of ‘sniff volume,’ may determine not only the moment of attack, but relative dynamic, tempo, and other less quantifiable parameters (smooth or jagged, heroic or melancholic, etc.): a set of dynamic qualities represented by perceptible movement.

   The moment of contact as a notational indicator is not new, nor dependent on digital media,8 but it does suggest a method whereby these interactions can be actualized with a high degree of temporal specificity, even in a generative context, and effectively transfer temporal agency from the performer to the score.

   Contact in the context of AMN is represented by the collision of two symbols, actualized as surface juncture. Contact can occur between objects of any shape or size, with at least one exhibiting dynamic qualities. The moment at which contact occurs signifies that some sonic event is to be performed by the player.

   One of the most common methods of contact involves a [dynamic] attack cursor making surface contact with a [static] node or play head. In these cases, contact occurs at the moment the cursor’s boundary collides with the node or play head’s boundary, followed by the cursor reversing its previous trajectory, appearing to bounce off the node, moving away in some other trajectory, or simply disappearing. The cursor will not penetrate the node’s boundary, and often follows a consistent trajectory (see Figure 14).

Figure 14. Contact: Dynamic attack cursor and static play head.

Intersection

Intersection, as an actualized indicator, consists of a [dynamic] attack cursor intersecting a [static] node or play head. This functionality requires the cursor to penetrate the node or play head, the cursor often continuing on in the same direction following intersection (see Figure 15). Intersection is often utilized for sustained or continuously modified events, and is regularly represented by a region. For continuously modified events, the alignment of the centroid is not applicable; what matters is instead the position of the attack point (line or node) within the region. In Cat Hope’s Cruel and Usual (2011), sustained tones are represented by regions in the form of straight and curved lines, their position in relation to the fixed attack line determining the relative degree to which the current pitch is detuned (see Figure 5).

   Related to this functionality is the aforementioned dynamic attack line, or swiping play head, in which the nodes are rendered static, the moment of attack determined by the attack line intersecting the node, although the general functionality is similar (see Figure 16) [5].

   8 From Max Fleischer to Karaoke, player piano rolls to Guitar Hero, contact and intersection have been the basis for a variety of media applications of real-time notational approaches throughout the 20th century.
Figure 15. Intersection: Dynamic attack cursor and static play head.

Figure 16. Intersection: Dynamic attack line, or swiping play head, and static node. Similar to the previous example, an event occurs at the moment the line aligns with the node’s center.

Figure 18. Study no. 16 [NavavaN] (2013) by Ryan Ross Smith. Red rectangles [attack cursor] converge on the black rectangles [static node] to denote the moment of attack.
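   In implementation terms, contact and intersection reduce to two collision tests run once per animation frame. The Python sketch below assumes circular symbols and is a generic illustration of the two behaviours described above, not code from any of the scores cited:

    import math

    def dist(a, b):
        return math.hypot(a[0] - b[0], a[1] - b[1])

    def step_contact(cursor_pos, cursor_vel, node_pos, cursor_r, node_r):
        """Contact: the cursor's boundary meets the node's boundary and the
        cursor reverses its trajectory (a 'bounce'); it never penetrates."""
        attack = dist(cursor_pos, node_pos) <= cursor_r + node_r
        if attack:
            cursor_vel = (-cursor_vel[0], -cursor_vel[1])
        return attack, cursor_vel

    def step_intersection(cursor_pos, node_pos, node_r):
        """Intersection: the cursor penetrates the node and continues on;
        the attack is signalled once the cursor's centre enters the node."""
        return dist(cursor_pos, node_pos) <= node_r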
Acknowledgments

Thank you to those composers and researchers who have inspired my initial and continued interest in this field of practice, in particular the work of Páll Ivan Pálsson, Jesper Pedersen, Cat Hope, Lindsay Vickery, David Kim-Boyle, K. Michael Fox, and the composers of the S.L.A.T.U.R. collective.

REFERENCES

[1] A. Clay, J. Freeman, “Preface: Virtual Scores and Real-Time Playing,” Contemporary Music Review 29, no. 1, 2010, p. 1.

[2] L. Vickery, “Increasing the Mobility of Stockhausen’s Mobile Scores,” 2010, http://www.slideshare.net/lindsayvickery/increasing-the-mobility-of-stockhausens-mobile-scores-2010-lindsay-vickery

[3] G. Douglas Barrett, M. Winter, “LiveScore: Real-Time Notation in the Music of Harris Wulfson,” Contemporary Music Review 29, no. 1, 2010, pp. 55-62.

[4] G. Hajdu, N. Didkovsky, “On the Evolution of Music Notation in Network Music Environments,” Contemporary Music Review 28, nos. 4/5, 2009, pp. 395-407.

[5] W. S. Cleveland, R. McGill, “Graphical Perception and Graphical Methods for Analyzing Scientific Data,” Science 229, no. 4716, 1985, pp. 828-833.

[6] L. Vickery, “The Limitations of Representing Sound and Notation on Screen,” Organised Sound 19, no. 3, 2014, pp. 215-227.

[7] D. Kim-Boyle, “Visual Design of Real-Time Scores,” Organised Sound 19, no. 3, 2014, pp. 286-294.

[8] G. E. Winkler, “The Realtime Score. A Missing Link in Computer-Music Performance,” Proceedings of Sound and Music Computing 2004, IRCAM, Paris, FR.

[9] C. Hope, L. Vickery, “Screen Scores: New Media Music Manuscripts,” Proceedings of the International Computer Music Conference 2011, University of Huddersfield, UK, July 31 – August 5, 2011.

[11] T. Hawkes, Structuralism and Semiotics. Berkeley: University of California Press, 1977.

[12] “Contact,” Merriam-Webster’s Collegiate Dictionary, accessed January 31, 2015, http://www.merriam-webster.com/dictionary/contact.

[13] J. Feldman, “What is a visual object?,” Trends in Cognitive Sciences 7, no. 6, 2003, p. 252.

[14] A. Treisman, “Features and objects in visual processing,” Scientific American 255, no. 5, 1986, pp. 114-125.

WORKS CITED

Y = f(x) by Þráinn Hjálmarsson (2012): https://vimeo.com/96485535
Study no. 10 by Ryan Ross Smith (2012): http://ryanrosssmith.com/study10.html
SPAM by Luciano Azzigotti (2009): https://www.youtube.com/watch?v=9U0Jb-7jRs4
Accretion by K. Michael Fox (2014): http://www.kmichaelfox.com/works/accretion/
Cruel and Usual by Cat Hope (2011): https://www.youtube.com/watch?v=CtNccMuPg4w&feature=youtu.be
Spooky Circle by Jesper Pedersen (2012): https://www.youtube.com/watch?v=NN5Z9c5lrac&feature=youtu.be
Study no. 40.1 [pulseighteen] by Ryan Ross Smith (2014): http://ryanrosssmith.com/study40_1.html
Study no. 8 [15 percussionists] by Ryan Ross Smith (2012): http://ryanrosssmith.com/study8.html
Study no. 31 by Ryan Ross Smith (2013): http://ryanrosssmith.com/study31.html
Study no. 40.3 [pulseven] by Ryan Ross Smith (2014): http://ryanrosssmith.com/study40_3.html
Study no. 16 [NavavaN] by Ryan Ross Smith (2013): http://ryanrosssmith.com/study16.html
SEMAPHORE: CROSS-DOMAIN EXPRESSIVE MAPPING WITH LIVE NOTATION

Richard Hoadley
Anglia Ruskin University
[email protected]
Figure 1. Dynamic notation from Semaphore, scene 1

aesthetic and practical considerations: the musicians are quite happy to encounter the music in this way). INScore also allows considerable control over the presentation of notation, an important feature for those composers who, like the author, find that the appearance of notation reflects its expressivity (while being mindful of notation devotee Cornelius Cardew’s warning that ‘a musical notation that looks beautiful is not a beautiful notation, because it is not a function of a notation to look beautiful’ [5]).

1.3.1 Live text

Unsurprisingly, ‘liveness’ has different consequences in different domains. For those working in the domain of text, the ability of Google Docs to update material synchronously for all users is literally a demonstration of the editing of material as ‘performance’. Inevitably some creative artists have used this platform as a way of interrogating particular methods of creating, viewing and performing with text [6]; others have used features of Skype and Twitter in similar ways [7].

   Book publishing tends to emphasise the finished product - the messy processes of writing and editing are obscured by the impeccable published item. There have been a number of projects making use of electronic and networked resources, including novel-writing as performance [8] and as real-time performance [9], writing as performance art [10], writing as a contest against time [11] and against other authors on-line in the Penguin Books competition ’We Tell Stories’ [12].

   Of course text can also be created and manipulated generatively rather than collaboratively. This is less prevalent in text-based media (although ‘off-line’ methods such as Oulipo [13] are well known and understood). One of the first practical references to the possibility of the algorithmic generation of meaningful text was by Alan Turing [14]. In this famous test Turing replaces the question “can machines think?” with “are there imaginable digital computers which would do well in the imitation game?” (The imitation game is one possible implementation of the Turing test.) While the test is for intelligence, in effect a major factor in communication is the requirement for the proper parsing of grammar through algorithms.

   This apparently simple idea has been highly influential as well as controversial. In 2014 the press reported ‘the first computer ever to pass the Turing Test’ [15] - a claim quickly disputed [16]. Eugene Goostman [17] joins a long list of attempts at the algorithmic generation of meaning, stretching back through chatterbots such as ELIZA [18].

   More recently there has been interest in the generation of robotic or virtual algorithmic creatures, for instance examples of real-time animation such as Larry the Zombie [19], or Milo from Kinect [20].

   Through these examples and others it is clear that live action requires a particular aesthetic - books, films, art and music are all based on planning or improvisation. Live action/live art tends to be based on forms of guided improvisation or semi-improvisation with forms that were not previously available, so allowing hybrid creative structures involving group and real-time coordination through generative notations.

1.3.2 Live notation in music

Music, drama and dance are temporal art forms having significant improvisatory and/or interpretive components.

   Over the last fifty years particular emphasis, even reverence [21], has been placed on the ‘urtext’ - most obviously in ‘classical’ musics where the score is, or has become, a fundamental element. This contrasts with many popular musics and jazz, where the skilful variation or personalisation of an existing ‘standard’ is frequently considered central (witness Bob Dylan’s own increasingly inventive variations in his performances of Like a Rolling Stone). In classical musics performers have been vilified for veering too far away from the original instruction or a ‘classic’ interpretation [22]. In forms where scores are less definitive - pop, jazz and other oral, aural and more improvised forms - ‘liveness’ is not in the form of notation, but in musical signals passing between musicians. (It may be significant that so-called tribute bands - replicas of older pop acts - now exist for whom authenticity is a main criterion.) All of these factors make the live generation of music notation a particularly hybrid form. Classically-trained instrumentalists are readily able to create dynamic and exciting performances from carefully constructed live notation - they are used to creating performances in deplorably short spaces of time from fearsome scores, after all. In this case, the live notation should not be too difficult, and proper thought must be given to its format and presentation (how to judge when to ’turn a page’ - whatever that means digitally - for instance). The author’s experience is that under these conditions musicians find performing from live scores exciting and exhilarating [23].
   In the technical operation of algorithmically structuring notation it is of prime importance to achieve a satisfactory balance between the maintenance of musical style and the creation of notation straightforward and clear enough to enable the musician to give an expressive performance even when almost sight-reading. For this reason the author has chosen to stick primarily to common practice notation. In addition, the notation has been kept as simple as possible, bearing in mind the modernist style of the music. These choices have been made in order to play to the skills of classically-trained performers, who, through years of experience, have a particular relationship with notation and are able to transform it into dynamic, expressive performance.

   Nonetheless, the live generative use of music notation has been generally less visible. While software for music notation has been developing for many years (Notator and Finale in 1988, Sibelius publicly released in 1993), there has been little apparent interest in methods of using notation both generatively and in live environments. More recently, LilyPond (e.g. [24]) has been used extensively as a platform for non-real-time generation of notation, and systems such as PWGL [25] and Slippery Chicken [26] have added very sophisticated notation facilities to computer-aided-composition software. As mentioned in section 1.3 there are now a number of options available to composers working in live music notation ([2–4, 27]), although the emphasis of these remains on computer-aided composition.

   Prominent ‘historical’ examples of live notation in music include Baird [24], Wulfson [28] and Kim-Boyle [29]. The use of notation in these cases mainly consists of the manipulation of image files or the generation of large quantities of material - for instance through the algorithmic coding of LilyPond files [30] (sketched below). However there are some more significant uses of live generated scores [31, 32]. Volume 29:1 of Contemporary Music Review (2010) is given over entirely to a review of live notation.
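   Generation of this kind usually amounts to writing a .ly file from code and handing it to the LilyPond compiler. A deliberately minimal Python sketch; the filenames and pitch choices are illustrative, it assumes a working LilyPond installation, and it stands in for none of the systems cited above:

    import random
    import subprocess

    # Choose eight pitches algorithmically and wrap them in a minimal
    # LilyPond source file.
    pitches = [random.choice(["c'", "d'", "e'", "f'", "g'", "a'", "b'"])
               for _ in range(8)]
    source = '\\version "2.24.0"\n{ ' + ' '.join(p + '4' for p in pitches) + ' }\n'

    with open('generated.ly', 'w') as f:
        f.write(source)

    # Render the notation to PNG (non-real-time generation).
    subprocess.run(['lilypond', '--png', '-o', 'generated', 'generated.ly'],
                   check=True)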
   Unsurprisingly, in a comparatively new field there are significant issues yet to be dealt with in the practical implementation of live notation. These include bridging the technical and aesthetic divide between notation and signals [31], general complications with synchronisation and timing, practical difficulties such as when to ‘page turn’, how to achieve the correct balance between reading and improvisation, as well as inherent issues such as sight-reading and how difficult-to-play notation can become before it requires practice. As Lukas Foss observed, there is ”the precise notation which results in imprecise performance”, and ”to learn to play the disorderly in orderly fashion is to multiply rehearsal time by one hundred” [33].

1.3.3 Live notation in dance and graphics

Prominent extant forms of dance/movement notation include Labanotation, or Kinetographie Laban, by Rudolf von Laban [34], Benesh Movement Notation (graphical representation of human bodily movements) and Eshkol-Wachman Movement Notation (graphical representation of the bodily movements of other species in addition to humans, and indeed any kind of movement, e.g. aircraft aerobatics), as well as others. These forms are primarily graphical, reflecting their main focus on movement rather than textual or symbolic meaning.

   While some forms of music notation have had a long and varied history, dance notation has not been so prominent. One of the reasons for this lies in the different functions that exist for dance notation. It is usually considered as a way of storing and passing on existing dances rather than as a way of expressing oneself, making the adoption or even exploration of dance or movement notation more difficult. It is rarely used in the communication of new dance work, and in spite of Albrecht Knust’s suggestion that in Labanotation “the symbols must speak directly to the eyes of the reader so that he can perform the movements without, or at least without too much, reflection” [35], there are questions as to how easily and quickly it can be read and digested. Text and music notations are generally so well understood by performers that this is not a problem (although it usually requires some time to ‘digest’ them (see section 5)). Some musics have tests for sight-reading ability, implying that financial considerations are very likely to reduce the capacity for detailed rehearsal!

   A further difference is that dance notation is generally considered such a specialised field that professional notators need to be employed, limiting its take-up in live work.

   Finally, a problem specifically associated with the live use of this notation is how it can be communicated to the dancers. Most commonly this is via a data projector, but this limits the dancer’s movements significantly.

   Recent developments linking live notation and dance have included a variety of instances of ‘hacking choreography’ and ‘live coding’ involving dance and other forms of embodied expression. While predominantly extensions of the physical computing methods mentioned above, the use of live coding as a form of notation has been imaginatively investigated by Alex McLean and Kate Sicchio in [36–38] and demonstrated in 2013 [39].

   While there are some practical problems with these systems - mainly around communicating the notation to the dancer - McLean’s version of Texture, demonstrated in [39], is both visually striking and expressive. It does, however, become increasingly complex as the dance progresses, making interpretation a particularly vital part of the interaction.
   While the present condition of dance notation can appear to be quite frustrating, particularly in its lack of standardisation, the field is open for further developments in notation systems.

2. CROSS-DOMAIN EXPRESSION

These three research streams together allow for the practice-led investigation of cross-domain expression. Cross-domain ways of thinking are so natural to us that it is difficult to imagine expression without them. Performed music is itself a cross-domain activity utilising both physical and mental dexterity. (Arguably the use of mixed metaphors (such as my own use of the phrase ‘mental dexterity’ in the previous sentence) is another example, as are metaphors and analogies themselves.)

   Writing about music often requires the use of metaphors, and particularly when we are seeking to analyse or describe less embodied musical forms, such as acousmatic music, we are even more reliant on other domains such as language and image [40].

   Most expressive domains themselves comprise a number of linked sub-domains. A lot of music, for instance, can be described as expression through pattern enabled by physical effort. This research leans heavily on these interdependencies, seeking to maximise expression and interaction through the exploitation of musicians’ learned performance skills articulated through common practice notations.

   Semaphore is a collaborative music-dance-text piece composed using research which seeks to translate between expressive domains using technology. An expressive domain is a form of artistic expression such as music, dance or text. Uniquely, information is taken from one domain and translated into another in real time, so allowing simultaneous performance. The music, environment and programming are by the author, choreography is by Jane Turner and text is by the novelist and poet Phil Terry. The music is performed live from code in the SuperCollider audio programming environment [41, 42], a combination of preprepared functions and structures, including some methods related to live coding.

3.1 A cross-domain sequence explained

Semaphore is composed of patterns of interactive cross-domain scenes and sequences. The following is an example of a single synchronous sequence:

   A dancer’s physical movement triggers and modulates the computer generation of a text phrase, which is displayed and performed. This performance is recorded and the recording is analysed spectrally. The results of the analysis then trigger and modulate a musical phrase presented as music notation, which is then played by an instrumentalist. A dancer responds to the performed phrase with a physical gesture.

   This set of actions might take place over a period from a few milliseconds to one or two seconds, or over an even more extended period of time. We find that the only significant latency occurs as performers consciously respond to newly displayed notations.

   Alongside its creative potential, this research enables people working in one domain to generate material in another. These people might be expert performers in another domain or members of the public with no particular expertise.

   There are many examples of movement-based interfaces for music, but this work is unique in its facilitation of translations from one domain into the notation of another: music, text, dance or graphics. The use of notation allows us to preserve the performance interpretation that many audience members find so fundamental in live art.

   Of course, the creative problem of how to create meaningful expression from these technical procedures remains as crucial as ever.

4. TECHNICAL PROCEDURES

In the following sections, the ways in which the parts of the sequence described above were implemented technically are outlined in more detail.

   The ubiquitous Microsoft Kinect (Xbox 360 version) is used to capture a dancer’s physical movements. The software used for programming the audio environment, SuperCollider, is also used to perform some rudimentary movement detection. Gesture recognition is not central to this research and the software does not seek to make precise distinctions between different gestures, but it is used to detect the speed and range of the movements of certain body parts. Effective though the Kinect is, the Loie Fuller Apparition dress which is used in part of the performance (see Figure 2) proved too concealing skeletally for the Kinect. For the next rehearsal, we used a bespoke ultrasound sensor device, the Gaggle [43], to gauge proximity and movement.
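   The movement detection itself lives in SuperCollider and is not reproduced here; the following Python sketch merely illustrates the kind of rudimentary analysis described - per-joint speed and range over a short window, with no attempt at gesture classification. Class and method names are assumptions of this edition:

    import math
    from collections import deque

    class JointTracker:
        """Tracks one skeleton joint and reports speed and movement range."""

        def __init__(self, window: int = 30):
            self.history = deque(maxlen=window)  # recent (x, y, z) positions

        def update(self, pos, dt: float) -> float:
            """Record a new position; return instantaneous speed."""
            speed = math.dist(pos, self.history[-1]) / dt if self.history else 0.0
            self.history.append(pos)
            return speed

        def movement_range(self) -> float:
            """Largest extent of movement, on any axis, over the window."""
            if len(self.history) < 2:
                return 0.0
            return max(max(c) - min(c) for c in zip(*self.history))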
4.2 ...triggers and modulates the computer generation of a text phrase...

Figure 3 shows a screenshot from Semaphore showing the results of a variety of text-based manipulations of the original text, displayed in INScore using its ability to parse HTML text and formatting. The original text was prepared
in collaboration with the team by the writer and poet Phil Terry especially for this performance. One of the key questions was how to achieve an expressive balance between sound and meaning in the text. Terry is well-versed in Oulipo techniques [13] and was aware of many possible technical textual procedures and their results - we wanted something focused and related to the Semaphore concept. Eventually, we decided on material that fell in between sound and semantics, and which also enabled some algorithmic manipulation. (Apparently by chance - or euphony - the word ’semantic’ appears in the poem, linked sonically to ’semaphore’.)

Figure 2. Loie Fuller Apparition costume. Photo © Chris Frazer-Smith 2014.

    Semaphore or some are for just as elsewhere
        some are against
    Some fear to offer or seem to fear
    Afar a fir so that through the undergrowth and
        across the map
    A flare or a car

    Soars to see the same semantic dance
    Oars soar with ease or seem to soar
    The same flares through the firs
    Seem to spore

    Ears arms too as a sheer harm
    Verse as shame same sheep
    Sham spheres or spare harems reap hope
    Marsh shears or fennel ash

   When we discovered that the original poem was too short, Terry expanded it using a pantoum structure, derived from the Malay pantun verse form, which repeats lines in a pattern, effectively doubling the original length:

    ABCD
    BEDF
    EGFH
    G I/A/C H J/A/C

Figure 3. Semaphore, scene 3

   This produces verses with a gentle, somewhat zen-like quality, emphasising the rather surreal nature of the original verse:

    Some fear to offer or seem to fear
    Soars to see the same semantic dance
    A flare or a car
    Oars soar with ease or seem to soar

    Soars to see the same semantic dance
    The same flares through the firs
    Oars soar with ease or seem to soar
    Seem to spore

    Ears arms too as a sheer harm
    The same flares through the firs
    Seem to spore
    Verse as shame same sheep
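   The repetition scheme printed above (ABCD, BEDF, EGFH, ...) - lines 2 and 4 of each quatrain returning as lines 1 and 3 of the next - is mechanical enough to sketch directly. The Python below is a generic illustration of the device, not Terry’s working method, and it omits the wrap-around of the final stanza (I/A/C, J/A/C):

    def pantoum(lines):
        """Weave a flat list of lines into pantoum-style quatrains:
        lines 2 and 4 of each stanza return as lines 1 and 3 of the
        next, roughly doubling the text (ABCD, BEDF, EGFH, ...)."""
        stanzas = [lines[0:4]]
        i = 4
        while i + 1 < len(lines):
            prev = stanzas[-1]
            stanzas.append([prev[1], lines[i], prev[3], lines[i + 1]])
            i += 2
        return stanzas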
   While the final part of Semaphore revolves around a pre-written poem (see section 4.3), an introductory, more abstract section (Figure 3) originally involved direct interaction between dancers and text. As an example, we arranged a passage where, if the movements of one of the dancers are faster or higher than a given threshold, a trigger is sent to an algorithm which then chooses one from a group of selected words from the poems (such as flashing, shear, roar, billows, swelling, etc.).
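   A minimal sketch of that trigger logic; the word list comes from the passage above, while the threshold values here are illustrative stand-ins for numbers tuned in rehearsal:

    import random

    TRIGGER_WORDS = ['flashing', 'shear', 'roar', 'billows', 'swelling']

    def maybe_trigger_word(speed, height,
                           speed_threshold=1.5, height_threshold=1.8):
        """If a dancer's movement is faster or higher than the threshold,
        choose one word from the selected group; otherwise return None."""
        if speed > speed_threshold or height > height_threshold:
            return random.choice(TRIGGER_WORDS)
        return None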
   Although the metaphors chosen here seem rather trite or simplistic, the scenario proved expressive, successful and full of potential.

4.3 ...the recording is analysed...

For the last part of Semaphore, we recorded Terry reading the poem. As we needed to mix between dry and wet audio streams we used a recording, although the use of a live voice (at least in part) reading live generated text is an important goal.

   The software analyses the frequency and amplitude components of the vocal. The base frequency generates a series of sustained chords accompanying the voice gently in the
Figure 5. Semaphore, scene 4
one-to-one mappings [45, 46] - for instance an upwardly moving arm might produce an upwardly proceeding arpeggio or scale - or it may be used as a form of gesture - a fast movement may produce a fast-moving string of notes (see notes 2-5 in Figure 1 above). Equally, the mapping may include some aspects of real-world behaviours and gestures [48].
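   Both mapping styles reduce to very small functions. A Python sketch follows; the scale, ranges and constants are assumptions for illustration, not the mappings actually used in Semaphore:

    C_MAJOR = [60, 62, 64, 65, 67, 69, 71, 72]   # one octave, MIDI numbers

    def one_to_one(arm_height: float) -> int:
        """Direct mapping: a rising arm selects rising scale degrees.
        arm_height is normalised to 0.0 (low) .. 1.0 (high)."""
        index = min(int(arm_height * len(C_MAJOR)), len(C_MAJOR) - 1)
        return C_MAJOR[index]

    def gestural(speed: float, max_speed: float = 3.0) -> int:
        """Gestural mapping: faster movement yields a denser string of
        notes. Returns the number of notes to emit for this beat."""
        return 1 + int(min(speed / max_speed, 1.0) * 7)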
  In some cases it is not possible to conclude a musical
phrase without synchronous information, again meaning
that some form of latency is inevitable.
   Finally, the involvement of humans and human perception and notation is itself probably the greatest cause of latency. Rehearsals with live notation suggest that ideally performers need a second or so from the moment that the new notation is displayed to properly digest and respond to it.

5.2 Effects of Latency

Stimulating creative results seem to arise from these developmental, even compositional choices, sometimes emphasising a direct, easily perceivable relationship between movement and result, sometimes confounding expectations with a melismatic flurry as if from nowhere.

   One of the difficulties some have with high levels of latency is that there is perceived to be a lack of control, even a lack of feeling of cause and effect. This implies that our main aim should be the creation of musical instruments in the best traditions of the New Interfaces for Musical Expression conference [49]. However, the design of musical instruments is not the main focus in this research. One aim in Semaphore is to investigate whether expert expressive movement can find a mapped reflection in another domain, in this case music or text. Latency might be a feature of the systems, but is not an issue for the team. If precisely timed responses are required, solutions are easily available, such as strict pre-planning of rhythm, movement and display, or even the simple playback of recordings.

6. AUDIENCE RESPONSE

6.1 Universities’ Week

Universities’ Week 1 provided a particularly successful occasion for about 60 members of the public of all ages to interact with our system voluntarily. Although interactions produced somewhat modernist music without clear melody or rhythm, and although it is likely that only relatively few of the participants understood music notation, it was clear that most enjoyed the experience immensely. Children in particular seemed able to relax and expressed themselves enthusiastically and with none of the self-consciousness so typical of their parents. A video recording of these interactions is available - please contact the author for access (see Figure 6 for an example screenshot).

   1 Universities’ Week 2014 provided research groups within UK universities with an opportunity to showcase their research to the public. We were invited to demonstrate the work behind Semaphore during the event held at the Natural History Museum in London in June 2014.

Figure 6. Universities’ week interactions

6.2 Rehearsal and acquaintance with the system

Feedback on all aspects of the composition and the notation system was gathered from the participants throughout the rehearsal process. This included two early rehearsals during which the author worked with one student dancer to properly ascertain basic functionality such as sensor ranges and sensitivities. While the Kinect can be quite sensitive to some movements, it is also the case that its basic design is to recognise simple bodily movements usually associated with sports and gaming rather than the sometimes delicate and gentle movements used in contemporary dance. These factors were also linked to allowances made for latency and reliability (see section 5). In Semaphore there are relatively few requirements for absolute and precise temporal coordination, although we are optimistic that more precise synchronisation can be achieved as the systems develop.

   Performers were encouraged to provide informal feedback throughout the rehearsal process and, as has happened in the past, it was soon apparent that the main problems emerged not from the generated music but rather from how it was displayed.

    Paper            Size (mm)    Area (mm²)
    A4               210 x 297    62370
    15” screen       332 x 204    67728
    foolscap         216 x 343    74088
    ‘common’ size    241 x 318    76638
    B4               250 x 353    88250
    music part       260 x 365    94900

    Table 1. Paper and screen sizes compared

   A quick comparison of paper sizes and areas (Table 1) shows that the screen area of a 15” MacBook Pro is quite small - resolution is rather irrelevant, as quite a large size
of notation is needed. Traditional music paper sizes are far from standardised, but tend to be quite significantly larger. The laptop’s screen also only allows for the viewing of one ‘page’ at a time, and this small screen is in landscape mode rather than portrait. All these factors mean that it is a very different experience reading from a laptop’s screen rather than from pieces of paper.

   Another problem relating to screen size and presentation is when ‘page turns’ should occur - and, in this new environment, exactly what a page turn is. In paper parts page turns, particularly in those parts where frequent or near constant playing is demanded, are planned carefully, maximising the time available to turn the page at the most convenient moment. This also means that when musicians turn the page they can consciously ‘discard’ previous information. Semaphore attempts a variety of experimental solutions, none of which are optimal as yet.

   At the moment it is clear that the use of live notation requires compromise in how it is implemented and used. For some composers these compromises may simply be too radical to consider at present.

   Jonathan Eacott [50] suggests that there is a requirement in live notation for ‘a metronome or cursor to keep musicians in sync’ and that there ‘must be a way of continually scrolling the music so that musicians can look ahead’ - these features would certainly be very useful. However, they are not essential, depending on the nature of the material presented. If the music appears note by note as it is being created, this has the advantage that it can give a fairly clear indication of the ‘tempo’ at which it should be played, and any further synchronisation can be achieved between instrumentalists as usual: paper parts have no cursors or metronomes.

   Apart from these issues, all involved with Semaphore and earlier pieces such as Calder’s Violin have been very positive about their experience with them. Although some have displayed confusion and anxiety on first acquaintance, after some rehearsal, and after realising that they are not required or expected to play every note with perfect accuracy, they relax and even enjoy the experience [23].

7. CONCLUSIONS

All who have been involved in Semaphore have been gratified by the response received from audiences and workshop visitors. The audience were offered the chance of completing a general questionnaire; fourteen were completed. These were uniformly positive; a number also contained free text comments. A selection is included below, not in a spirit of self-congratulation, but in order to demonstrate the connection felt between the audience, the dancers’ physical movements and the resulting music, both audio and notation:

   • “I really enjoyed the performance... it was interesting to watch the dancers ’create’ the music.”

   • “I came because of a fondness for dance but ... there is so much to take in here that it was useful to have two performances of the piece... Another couple of renditions would have permitted me to take in fully the choreography, the score, the text and the interaction of all the elements.”

   • “Thanks, it was beautiful”

   • “Very interesting, would attend another similar event”

   • “Really engaging and interesting... [the] performance was captivating”

   • “It was great, and I wish more events had a discussion and then second performance format, that worked well”

   • “Brilliant!”

   Those who took part during the Universities’ Week also clearly demonstrated that people find generating music in this way very enjoyable and rewarding. There would also appear to be a deep link between the domains of physical movement and music. Semaphore shows that it is also possible to create and manipulate translations between music, movement and text, and that both performers and audience find this expressive and stimulating. We very much hope to continue to develop these systems to enable expression and experimentation between domains. There are many possibilities that we have not even yet begun to explore.

Acknowledgments

The author would like to thank Anglia Ruskin University, Arts Council England and Turning Worlds Dance for their support in this project. In addition I would particularly like to thank Jane Turner, Phil Terry, Kate Romano, Cheryl Frances-Hoad, Ann Pidcock, Gwen Jones and Ciara Atkinson for their support.

8. REFERENCES

[1] D. Fober, Y. Orlarey, and S. Letz, “INScore: An environment for the design of live music scores,” in Proceedings of New Interfaces for Musical Expression, Oslo, Norway, 2011.

[2] N. Didkovsky and G. Hajdu, “MaxScore: Music notation in Max/MSP,” in Proceedings of the International Computer Music Conference, ICMA, Ed., SARC, Belfast, 2008.
[3] A. Agostini and D. Ghisi, "Bach: an environment for computer-aided composition in Max," in Proceedings of the International Computer Music Conference, Ljubljana, 2012, pp. 373–377.

[4] G. Hajdu, K. Niggemann, A. Siska, and A. Szigetvari, "Notation in the context of quintet.net projects," Contemporary Music Review, vol. 29, no. 1, pp. 39–53, 2010.

[5] C. Cardew, "Notation: Interpretation, etc.," Tempo, no. 58, pp. 21–33, Summer 1961.

[6] (2012, October). [Online]. Available: http://theperformanceartinstitute.org/2012/07/27/october-19-2012-the-artist-is-elsewhere/

[7] (2012, January). [Online]. Available: http://visualisecambridge.org/?p=132

[8] C. Ng. Novel-writing as performance art. [Online]. Available: http://fictionwritersreview.com/shoptalk/novel-writing-as-performance-art/

[9] R. Sloan. Writing as real-time performance. [Online]. Available: http://snarkmarket.com/2009/3605

[10] E. James. Writing as performance art. [Online]. Available: http://www.novelr.com/2009/10/10/writing-as-performance-art

[11] 3 day novel contest. [Online]. Available: http://www.3daynovel.com

[12] Penguin. We tell stories. [Online]. Available: http://www.wetellstories.co.uk

[13] P. Terry, Oulipoems 2. Ontario, Canada: Ahadada Books, 2009.

[14] A. Turing, "Computing machinery and intelligence," Mind, vol. 59, pp. 433–460, 1950.

[15] A. Griffin, "Turing test breakthrough as supercomputer becomes first to convince us it's human," June 2014. [Online]. Available: http://goo.gl/7SNpi7

[16] I. Sample and A. Hern, "Scientists dispute whether computer 'Eugene Goostman' passed Turing test," June 2014. [Online]. Available: http://goo.gl/wdQI1a

[17] V. Veselov, E. Demchenko, and S. Ulasen. Eugene Goostman. [Online]. Available: http://www.princetonai.com

[18] J. Weizenbaum, "Eliza - a computer program for the study of natural language communication between man and machine," Communications of the ACM, vol. 9, no. 1, pp. 36–55, January 1966.

[19] Motus Digital. Real-time animation. [Online]. Available: http://www.motusdigital.com/real-time-animation.htm

[20] P. Molyneux. Meet Milo, the virtual boy.

[21] H. Cole, Sounds and Signs. London: Oxford University Press, 1974.

[22] I. Stravinsky, Conversations with Igor Stravinsky. Faber and Faber, 1959.

[23] R. Hoadley, "Calder's violin: Real-time notation and performance through musically expressive algorithms," in Proceedings of International Computer Music Conference, 2012, pp. 188–193.

[24] K. C. Baird, "Real-time generation of music notation via audience interaction using python and gnu lilypond," in Proceedings of New Interfaces for Musical Expression, Vancouver, BC, Canada, 2005, pp. 240–241.

[25] M. Laurson, M. Kuuskankare, and V. Norilo, "An overview of pwgl, a visual programming environment for music," Computer Music Journal, vol. 33, no. 1, pp. 19–31, Spring 2009.

[26] M. Edwards, "An introduction to slippery chicken," in Proceedings of International Computer Music Conference, Ljubljana, 2012, pp. 349–356.

[27] A. Agostini, E. Daubresse, and D. Ghisi, "Cage: a high-level library for real-time computer-aided composition," in Proceedings of the International Computer Music Conference, Athens, Greece, September 2014, pp. 308–313.

[28] H. Wulfson, G. Barrett, and M. Winter, "Automatic notation generators," in Proceedings of New Interfaces for Musical Expression, New York, 2007.

[29] D. Kim-Boyle, "Real-time score generation for extensible open forms," Contemporary Music Review, vol. 29, no. 1, pp. 3–15, 2010.

[30] M. M. Lischka and F. Hollerweger, "Lilypond: music notation for everyone!" in Linux Audio Conference, 2013.

[31] S. W. Lee and J. Freeman, "Real-time music notation in mixed laptop-acoustic ensembles," Computer Music Journal, vol. 37, no. 4, pp. 24–36, 2014.
[32] J.-B. Barrière, "Distant mirrors," 2013.

[33] L. Foss, "The changing composer-performer relationship: A monologue and a dialogue," Perspectives of New Music, vol. 1, no. 2, pp. 45–53, Spring 1963.

[34] S. Barbacci, "Labanotation: a universal movement notation language," Journal of Science Communication, vol. 1, no. 1, 2002.

[35] A. Knust, "An introduction to kinetography laban (labanotation)," Journal of the International Folk Music Council, vol. 11, pp. 73–76, 1959.

[36] K. Sicchio, "Hacking choreography: Dance and live coding," Computer Music Journal, vol. 38, no. 1, pp. 31–39, Spring 2014.

[37] ——, "Coding choreography: Twists in language, technology and movement," Torque: Mind, Language and Technology, vol. 1, no. 1, pp. 145–151, 2014.

[46] A. Hunt and M. Wanderley, "Mapping performer parameters to synthesis engines," Organised Sound, vol. 7, no. 2, pp. 97–108, 2002.

[47] C. Salter, M. Baalman, and D. Moody-Grigsby, "Between mapping, sonification and composition: Responsive audio environments in live performance," Computer Music Modeling and Retrieval. Sense of Sounds, pp. 246–262, 2009.

[48] R. Godoy and M. Leman, Musical Gestures: Sound, Movement and Meaning. New York and London: Routledge, 2010.

[49] NIME, "New instruments for musical expression - http://www.nime.org/nime2014/," April 2014. [Online]. Available: http://www.nime.org

[50] J. Eacott, "Flood tide see further: Sonification as musical performance," in Proceedings of International Computer Music Conference, ICMA, 2011.
                   THE DECIBEL SCOREPLAYER - A DIGITAL TOOL
                        FOR READING GRAPHIC NOTATION
Lindsay Vickery and Stuart James worked together to develop a solution that would enable the presentation of screen scores for Decibel to perform. The entire ensemble has been involved in a process of creation and interpretation of musical works where new ideas and techniques are conceptualised, tested, evaluated, revised and disseminated in performances, recordings and archiving [5]. Through this process, the group developed a system for reading scrolling scores that was prototyped in MaxMSP. With the assistance of programmer (and Decibel viola player) Aaron Wyatt, these systems evolved into an iOS App, the Decibel ScorePlayer for the Apple iPad. It is now available on the iTunes Store internationally.

Decibel are of course not the first to engage with screen scores - previous work by Dannenberg [6], Clay and Freeman [7], Kim-Boyle [8] and others has examined the possibilities for real-time score generation on computers, and a variety of proprietary score generators for traditional notation are available, two examples being INscore [9] and MaxScore [10]. However, the use of graphic notation - newly composed and extant - in screen scores has been limited, and often tied to traditional notation. The digital format offers a range of possibilities for developing graphic notation practice, through the incorporation of aspects such as colour, real-time generation, video and interactivity. Decibel's score player investigations have focused primarily on this area of development, and on providing a 'reading mechanism' for performance, rather than a score generation tool.

THE DEVELOPMENT OF A SCROLLING SCORE PLAYER

The iTunes store describes the Decibel ScorePlayer as software that "allows for network-synchronised scrolling of proportional colour music scores on multiple iPads. This is designed to facilitate the reading of scores featuring predominantly graphic notation in rehearsal and performance" [11]. It works best for music that needs to be coordinated in a "timed" way, with proportional pitch structures. It is particularly useful for music that is pulseless, or requires pulse to be removed from the reading mechanism. The Decibel ScorePlayer is very good at presenting scores that in the past would have required a clock to coordinate multiple performers.

The Decibel ScorePlayer began as a bespoke solution to the problem of reading certain graphic scores, specifically those by Cat Hope, a composer and the ensemble director of Decibel. In 2008, before Decibel had begun, Hope's Kingdom Come (2008) for laptop duet featured a graphic notation read from left to right. The image was put in motion in a movie program, and the performers read the score at the point just before it passed off the screen. This was not particularly accurate, but provided an approximation of coordination that facilitated the performance. The score had been created on a computer, and did not exist in any real "physical" dimension. In preparation for the first Decibel concert in September 2009, Hope presented a score consisting of a computer print-out of ten landscape A4 pages stuck together, a kind of coloured line graphic score for five instruments - one of which was a turntable - again with the problem of how to read the music in a coordinated manner.

Figure 1. Cat Hope's score In The Cut (2009).

This piece was In The Cut (2009) for violin, cello, bass clarinet, bass guitar and turntable with subwoofer, and is shown in Figure 1. The piece does not treat harmony or meter in any 'traditional' way, adopting graphic notation as a way to better reflect a proportional approach to music composition [12].

A solution to the problem of reading In The Cut was provided through the creation of a MaxMSP patch, where the digitally created score file (a JPEG or PNG) was read by passing under a vertical line over a pre-prescribed period of time - in the case of In The Cut, seven and a half minutes - as shown in Figure 2. A control panel was built to adjust specifications for each performance, and was shown on the same screen as the score.

Figure 2. Lindsay Vickery's control panel for the score player built in MaxMSP.
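The core of the patch just described is a simple mapping from elapsed time to image position, so that the score drifts past a fixed vertical line at a constant rate derived from the piece's duration. The following Python sketch illustrates that arithmetic only; it is not the MaxMSP patch itself, and the function name, pixel width and example values are hypothetical.

    # Sketch of the scrolling-score arithmetic (hypothetical names and
    # values; the original implementation is a MaxMSP patch).

    def score_offset(elapsed_s: float, score_width_px: int,
                     duration_s: float, playhead_x: int) -> float:
        """x-position of the score image, so that the material sounding
        'now' sits under the fixed vertical reading line."""
        rate_px_per_s = score_width_px / duration_s  # constant scroll rate
        return playhead_x - elapsed_s * rate_px_per_s

    # E.g. for a piece of seven and a half minutes (450 s): an assumed
    # 13,500-pixel-wide score image scrolls at 13500 / 450 = 30 px/s.
    print(score_offset(60.0, 13500, 450.0, 100))  # image offset after 1 min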
This vertical line came to be known as the playhead, referencing the tape head on tape players. Musicians would play their part as it passed by the playhead, providing an accurate way of coordinating the performers together by reading the same part in the score at the same time. The playhead was placed slightly in from the left side of the score image, so that the performers could see the material approaching the playhead in advance, but also a small amount of material already performed, which would often assist in referencing the upcoming material. The coloured parts provided easy identification for the different performers, and the piece itself was proportional in its representation of pitch across all the instruments. The score presents each instrument's part as a long, slowly descending line, representing a very smooth sound quality that uses glissandi to move between different pitches. Simply, the score looks very much as it sounds, and this is supported by a number of audio spectrograms made of different performances, such as the example provided in Figure 3.

Figure 3. Spectrogram of a performance of Cat Hope's score In The Cut (2009) [13].

Vickery built the MaxMSP patch in consultation with Hope and the ensemble. It usually required the performers to have access to a full version of MaxMSP to run the program, though it was later made workable on Max Runtime. A number of works were written for this software player prototype, some for other ensembles, and some without electronics. One example is Hope's Kuklinski's Dream (2010) for instrumental trio, carving knives and electronics. Like In The Cut, the work is characterised by a lack of pulse, proportional pitch relationships, colour representations for different instruments and unusual instruments (in particular, carving knives bowed and amplified). A notated electronic part was also featured, requiring programming by the ensemble's electronics operator prior to performance. Another work by Hope, Wolf at Harp (2011) for four drum kits, used blocks of notation to describe fields of activity on certain parts of percussion kits, in this case the bass drum, cymbals and toms. The scrolling nature of these scores effectively communicates the composer's intention: a kind of pulseless music characterized by long sustained sounds. They also allow careful ensemble interactions, enabling an accurate reading of the proportional nature of the score.

READING AND NETWORKING

The first Decibel scrolling scores were projected onto a screen in the performance space, to facilitate musicians reading the score in performance. Whilst providing a straightforward solution to coordinating a performance, the performers mostly had their backs to the audience, hardly a desirable performance presentation format. The score was also a very predominant feature in the space. Many audience members would comment on the nature of the score and follow it intently during the performance. Whilst this brought a new audience to our concerts, seeking to 'understand' the practice of new music, it had become more of a focus than the music itself. To overcome this, Decibel member Stuart James added networking capacity, so that multiple laptop computers could be connected and coordinated over cabled Ethernet (a sketch of this clock-sharing idea follows at the end of this section). This meant that each performer had their own score player coordinated with the others in the ensemble. The patch was further developed by Vickery to fast-forward to different parts of a score, and to slow the speed of the piece for rehearsal purposes.

These developments made the software more workable in rehearsal situations, and some fifteen works were composed for this version of the player. The ensemble also began adapting a range of other composers' scores to be read by the ensemble using the patch, including Earle Brown's December 1952 for open instrumentation and Giacinto Scelsi's Aitsi (1974) for piano and electronics, among others. Works from Percy Grainger's Free Music project, namely his Free Music No. 1 (1936) for four Theremins and Free Music No. 2 (1937) for six Theremins, were put into the player. The pages of Grainger's hand-drawn score were joined together and scanned into a single file, the different parts traced over in different colours, and a playhead designed to include the list of pitches represented by the undulating lines that are a feature of this composition, as shown in Figure 4 [14].

Figure 4. Percy Grainger Free Music No. 1 (1931) adapted for the iPad Decibel ScorePlayer. This image shows the playhead replaced by a chromatic meter, and the scrub function along the bottom of the image, with the time elapsed on the right.
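As flagged above, the networked coordination amounts to sharing one playback clock among several machines, with each player deriving its own score position from the common elapsed time. The sketch below illustrates that idea in Python, assuming a plain UDP broadcast of elapsed seconds; the port number, message format and single-master topology are assumptions for illustration, not Decibel's actual protocol.

    # Hypothetical clock-sharing sketch: one master broadcasts elapsed time,
    # and each player scrolls its score from the shared clock.
    import socket
    import time

    PORT = 9000  # assumed port

    def run_master(duration_s: float) -> None:
        """Broadcast elapsed playback time once per second."""
        sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        start = time.monotonic()
        while (elapsed := time.monotonic() - start) < duration_s:
            sock.sendto(f"{elapsed:.3f}".encode(), ("<broadcast>", PORT))
            time.sleep(1.0)

    def run_client() -> None:
        """Receive the shared clock; every player reads from one time."""
        sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        sock.bind(("", PORT))
        while True:
            data, _ = sock.recvfrom(64)
            elapsed = float(data.decode())
            # ...update this player's score offset from `elapsed`...

    # run_master(450.0) on one machine; run_client() on each player's
    # machine standing in for an iPad or laptop.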
Other screen scores were being developed within the ensemble that included variations on the theme of scrolling presentation. Vickery's Ghosts of Departed Quantities (2011) for bass flute, bass clarinet, cello, keyboard and live electronics, for example, features music notation that subtly appears and disappears to the reader as it passes a playhead. Figure 5 shows the presentation of two instrumental parts, bass flute and bass clarinet. The musical information passes from left to right across the playhead.

In Ghosts of Departed Quantities, each performer has unique score activity, unlike Hope's scores, which required a tightly coordinated presentation of fixed materials. Vickery's screen scores presented materials that would arrive in a different order and quantity each time the piece was performed. Scores such as In The Cut provide performers with the possibility of choosing different starting notes for each performance, but require them to maintain the same pitch relationships each time.

The score player patch continued to be adjusted and developed to incorporate a range of new behaviors, including changes in the direction of the score. Hope's Liminum (2010) features a score in which musical material goes backwards and forwards, and the playhead jumps to different parts of the score at certain points. Again, each player's score is independent in this process, whilst being coordinated to start and finish together. In Juanita Nielsen (2012) these 'jumps' are coordinated to occur in random places, but coordinated across all players. These scores have been categorized as 'Variable Scrolling Scores'. In a collaborative work between Hope and Vickery, Talking Board (2011), circles traverse an image larger than the screen, serving as the guide for musicians to read that image, as shown in Figure 6. The movements of the circles provide information to an electronics operator for generative, interactive and spatialised electronic parts. Talking Board was a radical departure from the scrolling score format used on the score player up until that point, completely breaking away from the linear, left-to-right presentation and reading of the score. The circles have a series of different behaviors, including swarming, following, getting larger and smaller, appearing and disappearing [15]. It also required the transmission of data generated by movements on the score to another sound-generating computer, signaling the need for the score player to send more than score data, leading to investigations around the incorporation of Open Sound Control (OSC).
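As an illustration of the kind of message traffic this implies, the sketch below sends circle positions as OSC messages using the python-osc package as a stand-in; the address space, target host and port, and normalised coordinates are assumptions, not Decibel's published interface.

    # Hypothetical OSC sketch for Talking Board-style circle data.
    from pythonosc.udp_client import SimpleUDPClient

    client = SimpleUDPClient("192.168.1.10", 57120)  # audio computer (assumed)

    def send_circle_state(index: int, x: float, y: float, radius: float) -> None:
        """Report one circle's position and size, so the electronics
        operator's machine can map them to audio parameters."""
        client.send_message(f"/talkingboard/circle/{index}", [x, y, radius])

    send_circle_state(0, 0.25, 0.8, 0.05)  # circle 0: upper-left, small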
score produces sine tones as a result of the generative activity in the patch producing the score [8]. Between 2010 and 2012, a number of pieces were written for the scrolling score player by a range of composers, often characterised by the inclusion of non-traditional instruments that would otherwise be difficult to notate using conventional notations.
Figure 7. The Score Creator interface built by Aaron Wyatt and designed by Decibel composers in conjunction with him.

The iPad Decibel ScorePlayer provided a number of benefits over the laptop version. A much easier networking facility, native to iOS, meant each iPad user could join any network agreed on by the ensemble, and users could see who else was on the network at any time using a network tab [16]. Once .dsz files are created, users can add scores to the Player by uploading them in the sharing facility of iTunes, as seen in Figure 8.

instructions required for each individual score, as in Figure 9.

A User Guide is provided in the App to explain how it works, how to set up the network, and how to create your own scores for the App. This includes a contact email for any enquiries or bug-fix suggestions, and points the user to a web site where instructional videos are provided [17]. On the iPad ScorePlayer, the user can choose to see the score as a whole, or as individual parts. This function was first used on Hope's piece Juanita Nielsen for two violas, two cellos, piano, electric guitar and electronics, at the premiere performance of the Decibel ScorePlayer in September 2012 at the Perth Institute of Contemporary Arts. It became evident in rehearsals of Juanita Nielsen that the complex nature of the diagrams in the piece required magnification to be read accurately, and so the idea of providing separate parts was born. These can be added in the Score Creator in addition to a master score. The parts are coordinated with each other, even when the user drags a finger up or down the screen to change between different parts.

Figure 9. The 'User Guide' pop-up, as seen over the list of works in the player (screen shot).
or group of instruments needs to be referred to spatially in the score. The shift can be done by simply locking the rotation on the iPad and turning it to a portrait, instead of landscape, view, so the score flows upwards, rather than from left to right. Hope's piece Broken Approach (2014) for solo percussionist is read across a horizontal playhead, reflecting the spatial arrangement of the different percussion instruments in the performer's set-up, and is seen in Figure 11. Likewise, Hope's piano works Chunk (2010) and Fourth Estate (2014) use the playhead to reflect the horizontal presentation of the piano keyboard to the performer, the latter providing a shuffling mechanism that presents the composition differently each time, with eight different score images joining seamlessly in a different order each time the piece is opened on the ScorePlayer, using a 'tiling' approach for the different images. These scores have been named 'vertical scrolling scores'.

Figure 10. Hope's Juanita Nielsen. The top image shows the full score in the player. The lower image shows one part, visible at the same point in the piece. The playhead is in the middle of the screen as the score goes in different directions. A red light in the top right flashes twice as a warning that the direction is about to change.
                                                                            Figure 14. Hope’s Miss Fortune X score excerpt, (screen shot) showing
Figure 12. James Rushford’s Espalier in the Decibel ScorePlayer             the first issue Decibel ScorePlayer’s welcome screen for the piece. This
(screen shot). Note the times on the top of the score - rendered            information was later replaced with an information dropdown tab. Note
superfluous by the ScorePlayer.                                             the copy ‘noise’ on the right hand side of the image.
IV, V and VI and packaging them with the remaining two Variations into the John Cage Variations App, in consultation with Cage's publishers, Peters Edition, and the John Cage Foundation in New York. Scheduled for release in conjunction with the group's recordings of the eight Variations on US label MODE in 2015, the App takes aspects of the Decibel ScorePlayer and applies them to the Variations, creating graphic scores by following and automating Cage's detailed processes. The result is very accurate and easy-to-read notations for each of the Variations, an example of which can be found in Figure 16. This example shows the graphic representation selected by Decibel of the data generated according to Cage's specifications around the placement of dots, lines and other shapes.¹ It also shows the similarity of the presentation on the iPad to the Decibel ScorePlayer.

¹ A more detailed discussion of the implementation and the other Cage Variations can be found in a paper in the 2013 Malaysian Music Journal [19] and papers by Lindsay Vickery [20] and Cat Hope [21].

Figure 16. John Cage Variation 1 score excerpt (screen shot) showing the graphic representation that scrolls in the Decibel 'The Complete John Cage Variations' ScorePlayer.

Australian sound poet Amanda Stewart's Vice Versa (2001) is a one-page text for live performances. Decibel adapted the work as a variable scrolling score by typesetting the text in the score player, facilitating reading from different directions, at different times. A range of differently coloured parts is provided, and occasionally text appears scrubbed over, leaving the instruments to play the resulting shapes. Figure 17 shows the original score in the player, beside a screen shot of the scrubbed-over version. Experiments such as this one highlight the number of ways the simple reading device of the playhead can be used to create readable scores for different kinds of composition.

Figure 17. Amanda Stewart's Vice Versa (excerpt screen shot). The top image shows the score part (a different colour for each performer). The lower image shows the 'scrubbed out' text for instruments to play. The image goes left to right, and right to left in the player.

There are ongoing updates and bug fixes to the Decibel ScorePlayer, but the most recent developments have included the ability to create score files that embed a full-quality audio track into the .dsz format, opening the possibilities for a huge range of works for instrument and tape that could be adapted for the Decibel ScorePlayer. Vickery created a score player for his 2009 performance of Denis Smalley's piece Clarinet Threads (1985) for clarinet and tape that enabled the score to be read accurately alongside playback [22]. Hope's Signal Directorate (2014) for bass instrument/s and prerecorded sounds, prototyped in MaxMSP by Vickery, is the first piece to use the iPad ScorePlayer to deliver the score synchronized with audio playback from within the iPad, and contained within the .dsz file.
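To make concrete what embedding audio in a score file can involve, the sketch below packs a score image, an audio track and minimal metadata into a single zip-style archive. This layout is purely illustrative: the internals of the .dsz format are not described in this paper, so the file names, metadata keys and duration are all assumptions.

    # Hypothetical score-bundle sketch; NOT the actual .dsz layout.
    import json
    import zipfile

    def write_score_bundle(path: str, image_file: str, audio_file: str,
                           duration_s: float) -> None:
        """Pack one scrolling score image with its synchronised audio."""
        with zipfile.ZipFile(path, "w") as bundle:
            bundle.write(image_file, "score.png")   # the scrolling image
            bundle.write(audio_file, "track.aiff")  # embedded audio track
            bundle.writestr("meta.json", json.dumps({
                "name": "Signal Directorate",  # piece named in the text
                "duration": duration_s,        # playback length in seconds
            }))

    # write_score_bundle("signal_directorate.dsz", "score.png",
    #                    "track.aiff", 600.0)  # duration assumed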
The Score Creator will be updated to support the most recent facilities of the player. The next release will feature OSC compatibility and extra options for the Talking Board circle-reading paradigm, allowing users to insert their own image and select the number of circles required for a performance, as shown in Figure 18. OSC will enable the data required to drive the electronics in this piece to be sent to another computer running the audio manipulation software.

Figure 18. The 'circle selector' for The Talking Board, available when pressing the options tab.

In 2012, the first survey of Australian graphic music notation was curated by Cat Hope in two Australian cities, and featured a number of the scores for the scrolling score player presented as movies on a screen in a gallery [23]. These movie representations of scrolling scores are a fixed alternative for the reading of the scores, when a single projection is desirable. Synchronised with a live performance, they can also provide useful illustrations of how the works may be performed. However, for larger ensembles or more complex parts, it is sometimes difficult to see the required level of detail, and no variation of speed is easily possible.

CONCLUSIONS

Without any marketing support other than a few Facebook posts to the DecibelNewMusic page, and showcasing through tours, the Decibel ScorePlayer has sold 140 copies to date at AUD$2.99, not including the free copies the Decibel composers can access for the performances of their works. A visit to Malaysia by Decibel performing the 'John Cage Variations Project' using the bespoke application brought into sharp focus the need to make an Android version of the application, as Android appears to dominate the tablet computer market in large areas of Asia. However, funding for this development is yet to be found.

The potential for the Decibel ScorePlayer is substantial. There has been a recent resurgence of interest in graphic notation, with some detailed examinations of practice [24] [25] [26] and an awareness of animated notations disseminated by online services such as YouTube and Vimeo. Yet it is quite remarkable how few of these developments engage with the full potential of digital representation. Further negotiations with publishers could result in a number of approaches for digital publication of extant works, and currently any composer can put their work in the ScorePlayer and publish it.

Research into the impact of reading different kinds of screen scores has recently commenced. Using eye-tracking equipment, Vickery has been comparing traditional paper notations and the different kinds of score formats developed in Decibel [27], leading to detailed examinations of the way readers process colour and movement in music notation.

The Decibel ScorePlayer embraces the possibilities of colour and graphic notations in digital score reproduction, as well as the interactive possibilities inherent in digital score creation and composition. Whilst currently a relatively simple device, the possibilities for its development are considerable. It does not claim to solve problems for all types of graphic notation, but makes certain types more efficient to read. Screen scores are in their infancy, and the way we understand colour and shape as musical information, as well as our ability to process moving information on computer screens, requires further investigation [28]. The Decibel ScorePlayer represents the potential of group projects where composers, musicians, programmers and music curators can work together to extend the possibilities of available technologies.
Acknowledgments

Decibel new music ensemble consists of Cat Hope (artistic director, flutes, composer), Lindsay Vickery (composer, reeds, programmer), Stuart James (composer, piano, drum set, electronics, networking, programming), Aaron Wyatt (viola, violin and iOS programming), Tristen Parr (cello, testing) and Louise Devenish (percussion, testing). Lindsay Vickery created the first score player prototype. Stuart James built the Network Utility and led the team for the Decibel 'Complete John Cage Variations' ScorePlayer. Aaron Wyatt is the programmer of the iOS iPad Decibel ScorePlayer. The Decibel ScorePlayer project and the Complete John Cage Variations Project were funded with assistance from Edith Cowan University.

REFERENCES

[1] C. Hope (Ed.), Audible Designs, PICA Press, 2011, p. 6.

[2] Decibel (n.d.). Decibel CV. http://www.decibelnewmusic.com/ (accessed 24 Jan, 2015).

[3] C. Hope (Ed.), Audible Designs, PICA Press, 2011, p. 7.

[4] L. Vickery, "Screening the Score," in C. Hope (Ed.), Audible Designs, PICA Press, 2011, p. 86.

[5] H. Smith and R. T. Dean, Practice-led Research, Research-led Practice in the Creative Arts, Edinburgh University Press, 2009, p. 56.

[6] R. B. Dannenberg, "Music representation issues, techniques and systems," Computer Music Journal, Vol. 17, No. 3, 1993, pp. 20–30.

[7] A. Clay and J. Freeman, "Preface: Virtual Scores and Real-Time Playing," Contemporary Music Review, Vol. 29, No. 1, 2010, p. 1.

[8] D. Kim-Boyle, "Real-time Score Generation for Extensible Open Forms," Contemporary Music Review, Vol. 29, No. 1, 2010, pp. 3–15.

[9] D. Fober, Y. Orlarey and S. Letz, "INscore: an Environment for the Design of Live Music Scores." http://www.grame.fr/ressources/publications/INScore-ID12-2.pdf

[10] MaxScore for Max/MSP and Ableton Live. http://www.computermusicnotation.com

[11] The Decibel Score Player. iTunes Store: .../decibel-scoreplayer/id622591851?mt=8

[12] C. Hope and L. Vickery, "Visualising the score: Screening scores in real-time performance," IME Journal, Murdoch University, 2012.

[13] C. Hope, A. Wyatt and L. Vickery, "Reading Free Music," Australasian Musicological Journal, 2015, in review.

[14] L. Vickery, "The Evolution of Notational Innovations from the Mobile Score to the Screen Score," Organised Sound, Vol. 17, No. 2, 2012, p. 130.

[15] C. Hope and L. Vickery, "Screen Scores: New Media Music Manuscripts," Proceedings of the International Computer Music Conference (M. Adkins and B. Isaacs, Eds.), Huddersfield, UK: International Computer Music Association, 2011, pp. 224–230.

[16] A. Wyatt, C. Hope, L. Vickery and S. James, "Animated Music Notation on the iPad (Or: Music stands just weren't designed to support laptops)," Proceedings of the International Computer Music Conference, Perth, WA, 2013, pp. 201–207.

[17] The Decibel Score Player. http://www.decibelnewmusic.com/decibel-scoreplayer.html

[18] TestFlight. https://www.testflightapp.com

[19] C. Hope, L. Vickery, A. Wyatt and S. James, "Mobilising John Cage: The Design and Generation of Score Creators for the Complete John Cage Variations I–VIII," Malaysian Music Journal, Vol. 2, No. 1, 2013, pp. 34–45.

[20] L. Vickery, C. Hope and S. James, "Digital adaptions of the scores for Cage Variations I, II and III," Proceedings of the International Computer Music Conference, Ljubljana: International Computer Music Association, 2012, pp. 426–432.

[21] C. Hope, S. James and L. Vickery, "New digital interactions with John Cage's Variations IV, V and VI," Proceedings of the 2012 Australasian Computer Music Conference, Griffith University, Brisbane, Australia: Australasian Computer Music Association, 2012, pp. 23–30.

[22] L. Vickery, "Mobile Scores and Click Tracks: Teaching Old Dogs New Tricks," Proceedings of the Australasian Computer Music Conference, Australian National University, 2010.
[24] F. Feaster, Pictures of Sound: One Thousand Years of Educed Audio: 980–1980, Dust to Digital, 2012.

[25] R. Johnson (Ed.), Scores: An Anthology of New Music, Schirmer, 1981.

[26] T. Sauer, Notations 21, Mark Batty, 2009.

[27] L. Vickery, "Exploring a Visual/Sonic Representational Continuum," in Proceedings of the International Computer Music Conference, Athens, Greece, 2014, in press.

[28] L. Vickery, "The Limitations of Representing Sound and Notation on Screen," Organised Sound, Vol. 19, 2014, p. 226.
SPECTROMORPHOLOGICAL NOTATION: EXPLORING THE USES OF TIMBRAL VISUALIZATION IN ETHNOMUSICOLOGICAL WORKS
                          Mohd Hassan Abdullah                                           Andrew Blackburn
                      Sultan Idris Education University                            Sultan Idris Education University
                           [email protected]                                        [email protected]
All the kompangs used in an ensemble are tuned to a certain pitch, as close as possible to one another. Even though there is no standard tuning set for the kompang, an experienced kompang player is able to tell the "acceptable sound" of a kompang. The "acceptable sound" of a kompang is described by players as kuat (loud), gemersik (penetrating), tajam (sharp) and tegang (taut). How can one precisely understand and perceive the sound of a kompang as loud, penetrating, sharp and taut? Can one precisely describe the sharpness of the kompang's sound? As the sounds of indigenous musical instruments are mostly not standardized in nature, there is a need to find ways to identify and recognize the "acceptable sound" of any particular musical instrument, especially for beginners and those who are not experts in the field.

Moreover, contemporary Western art and traditional music notation is usually linked to analysis and the semiotic representation of the musical elements of melody and harmony (vertical and horizontal pitches) using common music notation. Precise pitch indications are "rounded out" into the twelve semitones of this system, unable to accommodate the more precise subtleties of sound that are inherent in all music traditions. Further, musical parameters such as articulation (attack, decay, sustain and release) and dynamics (volume or intensity) are loosely indicated through the use of staccato or phrase markings for articulations, or dynamic marks (forte, piano, crescendo, diminuendo, etc.).

Representation of other significant musical elements such as tone and colour (timbre) is largely limited to instrumental naming or specific performance directions (sul ponticello - play near the bridge for string instruments). This lack, along with the difficulties of defining and understanding timbre, is increasingly recognized within both the new music and traditional music fields.

AIMS OF RESEARCH

This project will explore the creation of a model for the timbral and performance notation of acoustic music that notates the various elements of sound in more detail. Of significance for ethnomusicologists working in this field will be the use of spectrographic notation, leading to the creation of an authentic and precise transcription library and catalogue inclusive of all musical elements. Such a catalogue will lead to a greater understanding of the individual and unique spectral and tuning characteristics of traditional Malay musical instruments. This method will be applied to instruments such as the kompang, gedombak, gendang, serunai, and rebab. Knowledge and experience of creating spectrograms of the Malay traditional instruments will then be applied at the forefront of music making using these possible models and systems.

RESEARCH QUESTIONS

In exploring the possibilities of using spectrographic features in ethnomusicological study, there are many related questions that can be addressed.

i. How can an ethnomusicologist describe the sound of a musical instrument?

ii. What are the elements that ethnomusicologists require from a notation system, and how can these be represented?

iii. What kind of notational/transcription system can describe precisely the musical sound of a traditional instrument?

iv. What organological elements are common or exclusive to each instrument, and how can they best be identified and analyzed?

v. Can spectrographic analysis and software be used to provide a method for defining and identifying unique qualities of Malay traditional instruments?

vi. Can this information be used to describe and notate the specific individuality of sound materials and performance methods in ways that expand the range and musical vocabulary of the ethnomusicologist?

vii. What parameters of analysis can be defined to provide useful and universally understood symbols using spectrographic software?

viii. How can this notational system help scholars, musicians, instrument makers and others in identifying a preferred timbre for any particular Malay traditional instrument?

ix. What other knowledge can be drawn from this?

METHODOLOGY

In conducting this study, various methods will be utilized to gather the data and information needed to answer the research questions. Generally, methods will be grounded in practice. While exploring the possibilities of using spectrography as a tool to describe the characteristics of a sound, the researchers will analyze and think through practice. This method is often referred to as practice-led research. Three phases will cumulatively document, analyze, apply and reflect on project activities and outcomes. Critical reflection is a
                                                             72
key criterion of the research, supported by textual ana-            After receiving clarification from the expert player,
lysis.                                                           the 4th beat of the sound is the most preferred sound by
    Research activities include identifying the sound            the expert player. One can analyze from the colours and
characteristic of a few selected Malay traditional musi-         density of the spectrogram to tell the characteristic of
cal instruments such as gedombak (goblet drum), gen-             the preferred sound.
dang (cylindrical drum), kompang (frame drum), serun-               Different filters have been applied to the one record-
ai (double-reed oboe type instrument), and rebab                 ing of the gedombak. The results show different fea-
(spike-fiddle). Each of them will be performed by the            tures of the sound performed on the same instrument.
expert players for the recording purpose. A few soft-            Below are the example of different spectrograms show
ware packages will be utilized to visualize the sound            different features and characteristic of a sound per-
characteristic of each instrument. From the spectro-             formed on Malay traditional instrument.
grams, the researchers will then think on how it can be
applied in ethnomusicological works.
                    THE RESULT
Many samples of Malay traditional instrument sound
have been recorded in the form of wave file. The in-
struments include the gedombak, gendang, serunai,
geduk and gong have been performed by the expert
players both solo and ensemble for the recording pur-
pose. Three software packages – Eanalyse, Sonic Visu-
aliser and Praat- have been utilized to visualize the
recorded clips.
                                                                 Figure 1. Spectrogram of a gedombak
                       Digital record-
                    ing
Sound clipping
                       Feature extrac-
                    tion
Spectrogram
                                                            73
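The pipeline in the flow diagram above can be made concrete with standard signal-processing tools. The sketch below is illustrative only: the study itself used EAnalysis, Sonic Visualiser and Praat, so the file name, clip length and window settings here are assumptions, not the project's actual settings.

# Illustrative sketch of the recording -> clipping -> feature extraction
# pipeline; the file name and analysis parameters are hypothetical.
import numpy as np
from scipy.io import wavfile
from scipy.signal import fftconvolve, spectrogram

rate, samples = wavfile.read("gedombak_take1.wav")   # digital recording
samples = samples.astype(np.float64)
if samples.ndim > 1:
    samples = samples.mean(axis=1)                   # mix down to mono
clip = samples[: rate // 2]                          # sound clipping: first 0.5 s

# Feature extraction 1: the spectrogram (time-frequency energy).
freqs, times, sxx = spectrogram(clip, fs=rate, nperseg=2048, noverlap=1024)

# Feature extraction 2: the normalized autocorrelation, whose amplitude
# range differs between instruments (cf. the conclusion below).
ac = fftconvolve(clip, clip[::-1], mode="full")[len(clip) - 1 :]
ac /= ac[0]

print("spectrogram bins:", sxx.shape)
print("autocorrelation min/max:", ac.min(), ac.max())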
Figure 4. Zoomed-in spectrogram of a gedombak

Figure 5. Spectrogram of a Wayang Kulit ensemble

DISCUSSION AND SUGGESTION

The spectrograms of the gedombak (goblet drum) in the Wayang Kulit (shadow puppet play) ensemble above are an initial attempt to explore the potential of the spectrogram as a performative notation. The gedombak appears as the large, regularly spaced columns in Figure 4. In Figure 5, the horizontal lines represent the melodic lines of the serunai. The pitch variations and arabesque ornamentation so characteristic of the instrument are also visible. This is the beginning of a plan to use spectrograms of individual instruments to identify the preferred timbral qualities of instruments for use in specific musical/dramatic contexts: why a Wayang Kulit "master" selects one instrument over another in a given performance.

Just what is timbral notation: gestural, purely tonal, semiotic? This question opens the potential for different forms and styles. In the ethnomusicological context, instrumental profiling of timbre, linked to the organology of the instrument, is both applicable in Malaysia and opens ideas that inform the practices of the other sub-projects of the overall research project.

CONCLUSION

In this paper, we dealt with the recognition of sound samples and presented several methods to improve recognition results. Tones were extracted from a database of Malaysian traditional musical instruments (gedombak, gendang, serunai, etc.). We used two different parameters in the analysis. From the experiments, we could observe evident results for the spectrogram and the autocorrelation. The maximum and minimum amplitude values of the autocorrelation have different ranges for the different musical instruments. The spectrogram of the gedombak is much larger than those of the gendang and serunai. The results show that the estimation of the spectrogram and autocorrelation reflects the differences between musical instruments more effectively.

Acknowledgements

We would like to thank the Ministry of Education, Malaysia for awarding the research grant under the Fundamental Research Grant Scheme. We also wish to thank Sultan Idris Education University, through its Research Management and Innovation Centre, which coordinated all the research activities. We also express our gratitude to the dean and the whole management team of the Faculty of Music and Performing Arts at the same university for providing all the facilities used in this research. Lastly, we thank all the people who were either directly or indirectly involved in this research. Without them, this study would not have been possible.

BIBLIOGRAPHY

[1] K. D. Martin: Sound-Source Recognition: A Theory and Computational Model, Ph.D. thesis, MIT, 1999.

[2] A. Livshin, X. Rodet: "Musical Instrument Identification in Continuous Recordings", Proc. of the 7th Int. Conference on Digital Audio Effects (DAFX-04), Naples, Italy, October 5-8, 2004.

[3] A. Eronen, A. Klapuri: "Musical Instrument Recognition Using Cepstral Coefficients and Temporal Features", Proc. of the IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2000, pp. 753-756.

[4] T. Kitahara, M. Goto, H. Okuno: "Musical Instrument Identification Based on F0-Dependent Multivariate Normal Distribution", Proc. of the 2003 IEEE Int'l Conf. on Acoustics, Speech and Signal Processing (ICASSP '03), Vol. V, pp. 421-424, Apr. 2003.

[5] A. Eronen: "Musical instrument recognition using ICA-based transform of features and discriminatively trained HMMs", Proc. of the Seventh International Symposium on Signal Processing and its Applications, ISSPA 2003, Paris, France, 1-4 July 2003, pp. 133-136.

[6] G. De Poli, P. Prandoni: "Sonological Models for Timbre Characterization", Journal of New Music Research, Vol. 26, pp. 170-197, 1997.
         DENM (DYNAMIC ENVIRONMENTAL NOTATION FOR MUSIC):
        INTRODUCING A PERFORMANCE-CENTRIC MUSIC NOTATION
                            INTERFACE
                                                                James Bean
                                                     University of California, San Diego
                                                           [email protected]
Javascript language to control the graphical output of Adobe Illustrator. This first phase of development served as research into the systematic organization of musical scores and the programmatic drawing of musical symbols. The vision of this project necessitates animated graphical content, which ultimately required the rewriting of all source code. There are certain features that were prioritized in the initial phase of development 1 that will ultimately be rewritten in an animated context.

2. GRAPHICAL USER INTERFACE

More and more performers are reading music from tablet computers. Software applications like ForScore display a PDF representation of a musical score, allowing a performer to turn pages with a Bluetooth footpedal, as well as to annotate scores with handwritten or typed cues. Performers are able to store many scores on a single device, simplifying the logistics of performing many pieces. Because the PDF contains no reference between graphical musical symbols and their musical functions, the degree to which a player is able to interact with this medium in a musical context is limited.

Many of the cues that performers handwrite in their parts are simplified versions of other players' parts [20]. These types of cues are being reentered by the performer, even though this information is already retrievable from the data that is being graphically represented by the score. The primary objective of denm is to expose the structures underlying the music to performers with little cost of access.

2.1 Graphic Design Priorities

The graphic design style of denm is minimalist, with as few graphical ornaments as possible. Rather, variations in color, opacity, line-thickness, and other graphical attributes are used to differentiate an object from its environment. In some cases, the variations in graphical attributes serve to differentiate an object's current state from its other potential states. Basic musical symbols, such as clefs and accidentals, have been redesigned to implement this universal design…

Figure 1. Design of clefs: treble, bass, alto, tenor.

2.1.1 Clef Design

Traditional clefs take up a considerable amount of horizontal space. The width of traditional clefs is problematic for the spacing of music, particularly when the preservation of proportionate music spacing is a high priority. The minimalist clefs in Fig. 1 take up very little horizontal space. Clefs are colored specifically to enable a differentiation of the clef from the surrounding context, and subtle breaks are made in the staff lines to accentuate the clefs' presence. Staff lines are gray, rather than black, enabling the creation of a foreground / background relationship between musical information-carrying objects (notes, accidentals, articulations, etc.) and their parent graph.

2.1.2 Accidental Design

Figure 2. Design of accidentals.

Accidentals, as can be seen in Fig. 2, are drawn programmatically, as opposed to being instances of glyphs from a font. The advantage of uniquely drawing each accidental is that small vertical adjustments can be made to individual components of the object (e.g. body, column(s), arrow) in order to avoid collisions in a more dynamic fashion than is usually implemented in other music notation software 2. Burnson's work with collision detection of musical symbols [22] serves as an example for the similar work to be approached in continued development.

2.1.3 Rhythm Design

In cases of embedded tuplets, beams are colored by the events' depth in the metrical hierarchy. Ligatures, as seen in Fig. 3, connect tuplet brackets to their events to clarify jumps in depth.

1 Automatically generated woodwind fingering diagrams, string tablature to staff pitch notation conversion, and automatically generated cues.
2 The initial development of denm in Adobe Illustrator-targeted…
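The component-wise adjustability described in Sec. 2.1.2 can be sketched as follows. The class and field names are hypothetical illustrations, not denm's actual source.

# Sketch of a programmatically drawn accidental whose components can be
# nudged independently to avoid collisions, which an indivisible font
# glyph cannot do. Names are hypothetical, not taken from denm.
class AccidentalComponent:
    def __init__(self, name, y_offset=0.0):
        self.name = name          # e.g. "body", "column", "arrow"
        self.y_offset = y_offset  # vertical adjustment, in staff spaces

class Accidental:
    def __init__(self, kind):
        self.kind = kind          # e.g. "quarter-sharp"
        self.components = [
            AccidentalComponent("body"),
            AccidentalComponent("column"),
            AccidentalComponent("arrow"),
        ]

    def nudge(self, component_name, dy):
        # Move a single component vertically, leaving the others in place.
        for c in self.components:
            if c.name == component_name:
                c.y_offset += dy

acc = Accidental("quarter-sharp")
acc.nudge("arrow", 0.5)           # lift only the arrow clear of a collision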
Figure 3. Design of beams and tuplet bracket ligatures

The layout of denm is organized as a hierarchy of embedded boxes that recalculate their heights based on what the user elects to show or hide within them. Fig. 6 shows these vertically accumulating boxes. Each box defines its own padding, keeping layout separation consistent for each object.
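A minimal sketch of this vertically accumulating box layout follows; the Box class and its fields are assumptions made for illustration, not denm's implementation.

# Sketch of a hierarchy of embedded boxes whose heights recalculate when
# content is shown or hidden. Names and units are hypothetical.
class Box:
    def __init__(self, height=0.0, padding=2.0):
        self.height = height      # intrinsic height of this box's own content
        self.padding = padding    # per-object separation, kept consistent
        self.children = []
        self.visible = True

    def total_height(self):
        # Accumulate own content plus every visible child, plus padding.
        if not self.visible:
            return 0.0
        inner = self.height + sum(c.total_height() for c in self.children)
        return inner + 2 * self.padding

system = Box(height=10)
staff = Box(height=40)
fingering = Box(height=15)        # e.g. a woodwind fingering diagram
staff.children.append(fingering)
system.children.append(staff)

print(system.total_height())      # 77.0
fingering.visible = False         # the user hides the diagram...
print(system.total_height())      # ...and every parent recalculates: 58.0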
Figure 8. Screenshot of another Metrical Grid.

Performers often create click-tracks for learning, rehearsing, and performing rhythmically complex music. Currently, the Metronome Graphic objects can be played back when a performer taps on the time signature for a measure. The Metronome Graphics "click" by flashing a different color in time. An animated bar progresses from left to right at the speed prescribed by the current tempo of the music. This process has yet to be implemented with other objects in the system, though this will continue to be developed.

As development continues further, a performer will be able to extract any portion of the musical part and rehearse it with the visual click-track of the Metronome Graphics at any tempo. Ultimately, an audio element will be integrated into this metronome process, with sonic attributes mirroring those of the visual Metronome Graphics, to represent subdivision-level and placement in the Metrical Analysis hierarchy (as described in Sec. 3.1).

2.4 Other Players' Parts

Performers often notate aspects of the parts of the other players in an ensemble context. Because this information already exists in the musical model, it can be graphically…

Figure 11. Screenshot of cue revelation.

3. MUSIC ANALYSIS ALGORITHMS

In order to provide performers with rehearsal tools in real-time, robust analysis tools must be developed.

3.1 Metrical Analysis

denm analyzes rhythms of any complexity. The result of this analysis is an optimal manner in which to subdivide the rhythm. Information like syncopation and agogic placement of events can be ascertained from this process. This process can be seen in Alg. 1.

Rhythm in denm is modeled hierarchically. The base object in this model is the DurationNode. Any DurationNode that contains children nodes (e.g. a traditional single-depth rhythm, or any container in an embedded tuplet) can be analyzed rhythmically. The result of this analysis of a single container node is a MetricalAnalysisNode (a DurationNode itself, with leaves strictly containing only duple- or triple-beat durations). MetricalAnalysisNodes are the model used by the Metronome Graphics, the graphical representation of which is described in Sec. 2.3.2.

…any rhythm with a relative durational sum of the same value, the process of which can be seen in Alg. 2.
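The hierarchical rhythm model just described can be approximated in a few lines. The subdivision rule below (greedily splitting a relative duration into duple and triple groups) is a deliberately simplified stand-in for denm's actual analysis, and all names other than DurationNode are assumptions.

# Sketch of the DurationNode tree and a toy metrical analysis that reduces
# a container to duple-/triple-beat groups. This simplification is not
# denm's published algorithm.
class DurationNode:
    def __init__(self, duration, children=None):
        self.duration = duration           # relative duration, in beats
        self.children = children or []

def metrical_analysis(node):
    # Return duple/triple groups summing to the container's total duration
    # (assumed >= 2), e.g. 7 -> [2, 2, 3].
    total = sum(c.duration for c in node.children) or node.duration
    groups, remaining = [], total
    while remaining > 0:
        if remaining in (2, 3):
            groups.append(remaining)
            remaining = 0
        elif remaining % 3 == 0:
            groups.append(3)
            remaining -= 3
        else:
            groups.append(2)
            remaining -= 2
    return groups

# A 7/8-like container holding events of 2, 2 and 3 relative beats.
container = DurationNode(7, [DurationNode(2), DurationNode(2), DurationNode(3)])
print(metrical_analysis(container))        # [2, 2, 3]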
3.2 Pitch Spelling

Effective pitch spelling is critical in the process of generating staff notation for music that is algorithmically composed or extracted from spectral analyses. Algorithms for pitch spelling within tonal musical contexts have been compared by Meredith [23] and Kilian [24]. The musical contexts that denm most immediately supports are rarely tonal. More often these musical contexts are microtonal. The…

[Figure: denm's textual input format. "#" creates a new measure; "9,16 FL" creates a new rhythmic group; a leading integer gives the relative duration of an event (e.g. "2 p 79.75"); an embedded tuplet is created simply by indenting the next events; embedding can occur to any depth.]
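The floating-point pitch values visible in the figure above (e.g. p 74.5, with middle C = 60) imply at least eighth-tone resolution. The sketch below simply snaps such values to the nearest eighth-tone and names them; it illustrates the representational problem, and is not denm's pitch-spelling algorithm.

# Sketch: quantize a float pitch (semitones, middle C = 60.0) to the
# nearest eighth-tone and print a crude name. Not denm's speller.
NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def spell(pitch):
    q = round(pitch * 4) / 4               # 0.25-semitone (eighth-tone) grid
    base = int(q)                          # chromatic step at or below q
    cents = int((q - base) * 100)          # residual: 0, 25, 50 or 75 cents
    octave = base // 12 - 1                # MIDI octave numbering
    name = f"{NAMES[base % 12]}{octave}"
    return name + (f"+{cents}c" if cents else "")

for p in (74.5, 79.75, 83.0):              # values from the figure above
    print(p, "->", spell(p))               # 74.5 -> D5+50c, etc.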
[6] J. Bresson, C. Agon, and G. Assayag, "OpenMusic: visual programming environment for music composition, analysis and research," in Proceedings of the 19th ACM International Conference on Multimedia. ACM, 2011, pp. 743–746.

[7] A. Agostini and D. Ghisi. What is bach? [Online]. Available: http://www.bachproject.net/what-is-bach

[8] M. Laurson, M. Kuuskankare, and V. Norilo, "An overview of PWGL, a visual programming environment for music," Computer Music Journal, vol. 33, no. 1, pp. 19–31, March 2009.

[9] M. Kuuskankare and M. Laurson, "Expressive Notation Package: an overview," in ISMIR, 2004.

[10] N. Didkovsky and P. Burk, "Java Music Specification Language, an introduction and overview," in Proceedings of the International Computer Music Conference, 2001, pp. 123–126.

[11] W. Burnson, Introducing Belle, Bonne, Sage. Ann Arbor, MI: MPublishing, University of Michigan Library, 2010.

[20] P. Archbold, "Performing complexity."

[21] D. Spreadbury, "Accidental stacking." [Online]. Available: http://blog.steinberg.net/2014/03/development-diary-part-six/

[22] W. A. Burnson, H. G. Kaper, and S. Tipei, Automatic Notation of Computer-Generated Scores for Instruments, Voices and Electro-Acoustic Sounds. Citeseer.

[23] D. Meredith, "Comparing pitch spelling algorithms on a large corpus of tonal music," in Computer Music Modeling and Retrieval. Springer, 2005, pp. 173–192.

[24] J. Kilian, "Inferring score level musical information from low-level musical data," Ph.D. dissertation, TU Darmstadt, 2005.

[25] E. Cambouropoulos, "Pitch spelling: A computational model," Music Perception, vol. 20, no. 4, pp. 411–429, 2003.

[26] S. Cunningham, "Suitability of MusicXML as a Format for Computer Music Notation and Interchange," in Proceedings of IADIS Applied Computing 2004 International Conference, Lisbon, Portugal, 2004.
   OSSIA: TOWARDS A UNIFIED INTERFACE FOR SCORING TIME AND
                         INTERACTION
ABSTRACT

The theory of interactive scores addresses the writing and execution of temporal constraints between musical objects, with the ability to describe the use of interactivity in the scores. In this paper, a notation for the use of conditional branching in interactive scores will be introduced. It is based on a high-level formalism for the authoring of interactive scores developed during the course of the OSSIA research project. This formalism is meant to be at the same time easily manipulated by composers and translatable to multiple formal methods used in interactive scores, like Petri nets and timed automata. An application programming interface that allows the interactive scores to be embedded in other software, and the authoring software, I-SCORE, will be presented.

…events occurring at the same time; the work presented here removes this limitation. Here, we will initially present the use cases for conditional branching, as well as several existing works of art which involve conditions. Then, we will introduce new graphical and formal semantics, researched during the course of the OSSIA project. Their goal is to allow composers to easily make use of conditional branching during the authoring of interactive scores. We will show compliance with previous research in the same field, which allows for strong verification capabilities. We will conclude by presenting the software implementation of these formalisms in the upcoming version 0.3 of I-SCORE, which will be able to edit and play such scenarios in a collaborative way.
2.1 Conditional works of art

Some of the most interesting cases happen in more recent times, with the advent of composers trying to push the boundaries of composition techniques. John Cage's Two (1987) is a suite of phrases augmented with flexible timing: "Each part has ten time brackets, nine of which are flexible with respect to beginning and ending, and one, the eighth, which is fixed. No sound is to be repeated within a bracket." The brackets are of the form 2′00″ ↔ 2′45″ and are indicated at the top of each sequence.

Branching scores can be found in Boulez's Third Sonata for Piano (1955-57) or in Boucourechliev's Archipels (1967-70), where the interpreter is left to decide which paths to follow at several points of bifurcation along the score. This principle is pushed even further in the polyvalent forms found in Stockhausen's Klavierstück XI (1957), where different parts can be linked to each other to create a unique combination at each interpretation. Some of these compositions have already been implemented on computers; however, this was generally done on a case-by-case basis, for instance using specific Max/MSP patches that are only suitable for a single composition. The use of patches to record and preserve complex interactive musical pieces is described in [3].

The scripting of interactive pieces can also be extended towards full audio-visual experiences, in the case of artistic installations, exhibitions and experimental video games. Multiple case studies of interactive installations involving conditional constraints (Concert Prolongé, Mariona, The Priest, Le promeneur écoutant) were conducted in the OSSIA project. Concert Prolongé (i.e. extended concert) offers an individual listening experience, controllable on a touchscreen, where the user can choose between different "virtual rooms" and listen to a different musical piece in each room, while continuously moving his virtual listening point, thus making him aware of the (generally unnoticed) importance of the room acoustics in the listening experience. Mariona [4, section 7.5.3] is an interactive pedagogic installation relying on automatic choices made by the computer in response to the users' behaviours. This installation relies on a hierarchical scenarization, in order to coordinate its several competing subroutines. The Priest is an interactive system where a mapping occurs between the position of a person in a room and the gaze of a virtual priest. Le promeneur écoutant 2 (i.e. the wandering listener) is a stand-alone interactive sound installation designed as a video game with different levels of exploration, mainly by auditory means.

In closing, interactive applications for exhibitions offer various situations in which conditional constraints are required, from touchscreen applications to full-fledged interactive installations. Several projects have been studied in the scope of OSSIA, in order to make the creation of new complex interactive applications more efficient by using the tools that are developed in this research project.

2.2 Existing notations for conditional and interactive scores

We chose to compare the existing notations on a scale that goes from the purely textual, like most programming environments, to the purely graphic, like traditional sheet music. Similarly, there are multiple ways to define interactivity and, consequently, multiple definitions of what an interactive score is.

The programmatic environments generally take a preexisting programming language, like LISP, and extend it with constructs useful for the description of music. This is the case, for instance, with Abjad [5], based on Python and LilyPond, a famous music typesetting software based on a TeX-like syntax. There are also programming languages oriented more towards the interpretation and execution of a given score, which can take the form of the program itself. This is the case with Csound and Common Music [6]. In general, programming languages of this kind offer a tremendous amount of flexibility in terms of flow control. However, they require additional knowledge from the composer who wants to write scores with them.

The purely graphic environments allow the composition of scores without the need to type commands, and are much closer to traditional scores. For instance, multiple Max/MSP externals (Bach for Max/MSP [7], note~ 3, rs.delos 4 and MaxScore [8]) allow one to write notes in a piano roll, timeline, or sheet music from within Max. But they are geared towards traditional, linear music-making, even if one could build a non-linear interactive song by combining multiple instances, or by sending messages to the externals from the outside.

Finally, there is a whole class of paradigms that sits between the two, with the well-known "patcher"-like languages: PureData, Max/MSP, OpenMusic [9], PWGL [10]. These environments work in terms of data flow: the patch represents an invariant computation which processes control and/or audio data. In each case, it is possible to work purely graphically, and flow control is generally implemented as a block that acts on this data ([expr] in Pd/Max or [conditional] and [omif] in OpenMusic, for instance). These environments all allow the use of a textual programming language to extend the capabilities or express some ideas more easily.

2 http://goo.gl/et4yPd
3 http://www.noteformax.net
4 http://arts.lu/roby/index.php/site/maxmsp/rs_delos
…execution of the score that would lead to this case. This has practical implications, especially when working with hardware, which can have hard requirements on the input data. This means that the notation will have to be grounded in solid formal semantics.
…description and execution of sound processes to occur directly in the score.

2.4.2 Temporal Concurrent Constraint Programming

Since interactive scores can be expressed in terms of constraints (A is after B), one of the recurrent ideas for their formalisation was to use Non-deterministic Temporal Concurrent Constraint Programming (NTCC), since it allows constraint solving. This approach was studied by Antoine Allombert [15] and Mauricio Toro [4, 16].

However, there are multiple problems, notably the impossibility of easily computing the duration of a rigid constraint, and the exponential growth of the computation time of constraint solving, which led to some latency in the implementation, making real-time operations impossible.

2.4.3 Reactive programming

Due to the static nature of models involving Petri nets and temporal constraints, a domain-specific language, REACTIVE IS [17], was conceived in order to give dynamic properties to interactive scores. An operational semantics is defined using the synchronous paradigm, allowing both static and dynamic analysis of the interactive scores. This also allows composers to easily describe parts of their score that cannot be efficiently represented in a visual manner.

2.4.4 Timed Automata

The current focus of the research is on the investigation of models for the formal semantics of conditional constraints in interactive scores.

This has been achieved using the extended timed automata of UPPAAL. Timed automata allow the description of both the logical and the temporal properties of interactive scores. Moreover, the shared variables provided by UPPAAL make it possible to model the conditionals. They are also used for hardware synthesis, in order to target Field-Programmable Gate Arrays (FPGAs) [18]. Real-time execution semantics is implemented with this method.

The problem of the implementation of loops is, however, still unresolved: loops make static analysis of the score harder, since we run into the reachability problem.

3. THE OSSIA PARADIGM

3.1 Presentation

OSSIA (Open Scenario System for Interactive Applications) is a research project, presented in [19] and funded by the French agency for research (ANR). Its goal is to devise methods and tools to write and execute interactive scenarios. The two main objectives are to provide a formalisation for interactive scenarisation, and seamless interoperability with existing software and hardware. This paper will focus on the interactive scoring part, the interoperability being provided by the Jamoma Modular framework [20], which allows the use of multiple protocols, such as OSC or MIDI.

When comparing with the previous approaches to interactive scores (Acousmoscribe, Virage, i-score 0.2), the OSSIA project tries to follow a "users first" philosophy: the research work is shared and discussed with artists, developers, and scenographers from the musical and theater fields, and their use cases serve as a basis for the focus of the research. They are in turn asked to try the software and discuss the implementation.

For instance, in the previous studies of interactive scores, a mapping had to be done between the theoretical foundation (Petri nets, temporal constraints…) and the domain objects with which the composer had to interact. This has led to mismatches between the representation and the execution [17] of the score. The most prominent problem was the inability to cleanly express multiple synchronized conditions, and to route the time flow according to these conditions. The formalism also did not allow boxes to directly follow each other in a continuous manner, and always required the existence of a relationship between them. Instead, in the OSSIA project, we tried to conceive high-level concepts that would allow a composer to easily write an interactive score, to build a software over these concepts, and then to implement them on the basis of the formalisms presented in part 2.4.

The main concepts of interactive scores can be grouped in two categories: temporal elements and contents. The temporal elements (scenarios, instantaneous events, temporal constraints, conditional branching and hierarchy) create the temporal and logical structure of the scenario, and the contents (states and processes) give actual control over several kinds of processes.

3.2 Temporal elements

In order to allow the composer to write interactive conditional scores, it is necessary to provide temporal constraints, to allow at least a partial ordering between the different parts of the score. This is done using four base elements: Node, Event, Constraint and Scenario. A Node (Time Node) represents a single point in time. An Event describes an instantaneous action. A Constraint describes the span of time between two given Events. Finally, the Scenario structures the other elements and checks that the temporal constraints are valid and meaningful.

3.2.1 Scenario

A Scenario is defined as a union of directed acyclic graphs. The vertices are Events and the edges are Constraints.
[Figure: an example scenario with Events A to F, where the Conditions C (/x = 1) and E (/x ≠ 1) route execution towards D or F.]

(b) A rigid constraint between two events. The minimum and maximum duration of the constraint are equal; the date of the end event is fixed with regard to the date of the start event.

(c) A constraint with a non-null minimum and a different, non-infinite maximum.

(d) A constraint with a non-null minimum and an infinite maximum.

(e) A constraint with a null minimum and an infinite maximum. Instead of making the representation heavier by having the dashes of the constraint continue indefinitely, we chose to remove the rake to symbolize infinity.

Figure 3. The OSSIA Graphical Formalism

The direction is the flow of time. It allows the other base elements to be organized in time.

Scenarios follow these basic rules:

• There can be multiple Events explicitly synchronized by a single Node

• A Constraint is always started by an Event and finished by another, distinct Event

Events and Constraints are chained sequentially. Multiple Constraints can span from a single Event and finish on a single Event, as shown in fig. 7. The operational semantics of these cases will be described later. This allows different processes to start and/or stop in a synchronized manner.

…on the same Node are also evaluated and instantaneously triggered (or discarded if their Condition is not met, see section 3.3.1).

3.2.3 Constraints

A Constraint represents a span of time. Due to the interactive nature of the proposed paradigm, the span can change at execution time, like a fermata. When the author wants to allow a Constraint to have a variable duration, he renders it flexible. This means that the end of the Constraint depends on the Condition of its final Event.

A Constraint can be activated or deactivated: if it is deactivated, it will not count towards the determination of the execution span of its end event.

The graphical representation of a Constraint can change according to its minimum and maximum duration. The minimum m ranges over [0; +∞], and the maximum M ranges over [m; +∞] \ {0}. In the user interface (introduced in section 4), the duration is directly linked to the horizontal size and is visible on a ruler.

The graphical formalism for these elements is presented in fig. 3.

The Node is a vertical line. An Event is a dot on a Node. If there is a trigger on the Event, a small arrow indicates it. The colour of the arrow can change at run-time to indicate the current state of the trigger.

The Constraint is a horizontal line that represents a span of time, like a timeline. If the constraint is flexible, the flexible part is indicated by dashes and a rake. When there is no maximum to the constraint, there is no rake.
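These four base elements and the duration bounds above can be summarized in a small data model. The sketch below uses hypothetical field names; it is not the OSSIA or i-score API.

# Sketch of the four base temporal elements; names are illustrative,
# not the actual OSSIA/i-score classes.
from dataclasses import dataclass, field
from math import inf
from typing import List, Optional

@dataclass
class Event:                           # an instantaneous action
    condition: Optional[str] = None    # e.g. "/x == 1"; None means unconditional

@dataclass
class Node:                            # a single point in time
    events: List[Event] = field(default_factory=list)  # synchronized Events

@dataclass
class Constraint:                      # a span of time between two Events
    start: Event
    end: Event
    min_dur: float = 0.0               # m in [0; +inf]
    max_dur: float = inf               # M in [m; +inf] \ {0}
    active: bool = True                # deactivated spans are ignored

    @property
    def rigid(self) -> bool:           # fig. 3 (b): fixed end date
        return self.min_dur == self.max_dur

@dataclass
class Scenario:                        # a union of DAGs: Events as vertices,
    nodes: List[Node] = field(default_factory=list)               # Constraints
    constraints: List[Constraint] = field(default_factory=list)   # as edges

    def validate(self) -> bool:
        # basic rule: a Constraint links two distinct Events
        return all(c.start is not c.end for c in self.constraints)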
[Figure: a nested conditional example with Events A to J. The Conditions C (/x = 1) and E (/x ≠ 1) branch after A and B; after C, the Conditions G (/y = 1) and I (/y ≠ 1) branch further towards H and J.]

…/a/val == true && /another/val == false

which will trigger when the parameters have both been set (not necessarily at the same time) to the required values.

Higher-level operations, like a mouse click on a GUI, can then be translated into conditions on Events, in order to bring rich interaction capabilities to the software dedicated to the execution of the scores.
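How such an address-based condition might be checked against incoming parameter values can be sketched as follows; real i-score conditions are parsed from expressions like the one above, whereas this sketch hard-codes one for brevity.

# Sketch of evaluating the condition "/a/val == true && /another/val == false"
# against a dictionary of device parameters; illustrative only.
params = {"/a/val": False, "/another/val": False}

def condition_met(state):
    return state["/a/val"] is True and state["/another/val"] is False

print(condition_met(params))   # False: /a/val has not been set to true yet

params["/a/val"] = True        # the two parameters may arrive at different times
print(condition_met(params))   # True: both requirements now hold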
Require: We enter the evaluation range for a Node n
  if all(n.eventList(), Event::emptyCondition) then
    repeat
      wait()
    until n.DefaultDate
    for Event e in n.eventList() do
      e.run()
    end for
  else
    repeat
      if any(n.eventList(), Event::isReady) then
        nodeWasRun ← true
        for Event e in n.eventList() do
          if validate(e.condition()) then
            e.run()
          else
            e.disable()
          end if
        end for
      end if
    until nodeWasRun
  end if

Figure 6. Execution algorithm for a Node

[Figure: Events A to I, with conditions CondB and CondF, a synchronization point Tsync, and minimum durations ABmin, CBmin, BDmin, EFmin and GHmin.]

Figure 7. A complete example of a Node in the OSSIA graphical formalism. Two constraints converge on the Event B, and three constraints branch from the Node that synchronizes B, F, G. The durations are already processed by the constraint solver.
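The pseudocode of Fig. 6 translates almost line for line into Python. The Event and Node methods below are hypothetical stand-ins for the engine's internals, and the polling loop is simplified.

# Transcription of the Node execution algorithm (Fig. 6); Event/Node
# methods are assumed stand-ins, not the real i-score implementation.
import time

def execute_node(n):
    if all(e.condition is None for e in n.events):
        # No conditions: wait for the Node's default date, then run everything.
        while not n.default_date_reached():
            time.sleep(0.001)
        for e in n.events:
            e.run()
    else:
        node_was_run = False
        while not node_was_run:
            if any(e.is_ready() for e in n.events):
                node_was_run = True
                # Evaluate every Event on the Node at once: run those whose
                # Condition holds and disable the others.
                for e in n.events:
                    if e.validate_condition():
                        e.run()
                    else:
                        e.disable()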
3.4 Contents

An Event may contain a State, which contains messages, and can itself hierarchically contain other States. At a conceptual level, for the composer, a State generally represents a change of state in a remote or local device, i.e. a discontinuity.

A Constraint also acts as a container: during its execution, several Processes can be executed in parallel.

The two main Processes are: …

There can be two possible executions for Processes: they can do something on each tick of a scheduler, or they can receive start and stop signals and behave however they want in between.

Processes can share data with the Events at the beginning and the end of the parent Constraint, by putting them in specific States.

The API provides ways for somebody to implement his own processes and use them afterwards in scores.
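The two execution styles for Processes map naturally onto two small interfaces; these class names are illustrative assumptions, not the published OSSIA API.

# Sketch of the two Process execution styles described above.
class TickedProcess:
    # Does something on each tick of a scheduler.
    def tick(self, position):          # position in [0, 1] within the Constraint
        print(f"automation value at {position:.0%}")

class TriggeredProcess:
    # Receives start/stop signals and behaves freely in between.
    def start(self):
        print("sound file: play")

    def stop(self):
        print("sound file: stop")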
…dependency graph is shown in fig. 10. The different sub-projects are: …
However, in order to achieve more expressive power, we still need to find a way to implement loops. Two approaches are currently being studied: one using a Loop process, and another using a concept of goto; once one is chosen, we will try to find a relevant graphical element to present it.

Furthermore, there could be some interest in the specification and implementation of variables, which could alleviate the need for adjacent software like Max/MSP to perform complex logical computations. This could pave the way towards a time-oriented, Turing-complete programming language, with a simple graphical representation which would allow composers to write complex scores in an understandable way. Another track is the implementation of an audio engine, for instance by embedding FAUST 6, in order to be able to produce sound directly from i-score. The relevant parameters would then be exposed and controlled within i-score.

The next step for the graphical formalism is to conduct usability studies in order to find the most convincing interactions in the authoring software for the composers.

[6] H. Taube, "An Introduction to Common Music," Computer Music Journal, pp. 29–34, 1997.

[7] A. Agostini, D. Ghisi, and C.-C. de Velázquez, "Gestures, Events, and Symbols in the Bach Environment," Actes des Journées d'Informatique Musicale, pp. 247–255, 2012.

[8] N. Didkovsky and G. Hajdu, "MaxScore: Music notation in Max/MSP," in Proceedings of the International Computer Music Conference, 2008, pp. 483–486.

[9] J. Bresson, C. Agon, and G. Assayag, "OpenMusic: visual programming environment for music composition, analysis and research," in Proceedings of the 19th ACM International Conference on Multimedia. ACM, 2011, pp. 743–746.

[10] M. Laurson, M. Kuuskankare, and V. Norilo, "An overview of PWGL, a visual programming environment for music," Computer Music Journal, vol. 33, no. 1, pp. 19–31, 2009.
[18] J. Arias, M. Desainte-Catherine, and C. Rueda, "Exploiting Parallelism in FPGAs for the Real-Time Interpretation of Interactive Multimedia Scores," submitted to Journées de l'Informatique Musicale, 2015.
                              A SIGN TO WRITE ACOUSMATIC SCORES
                                                         Jean-Louis Di Santo
                                                              SCRIME
                                                 [email protected]
phenomenological point of view. The TARSOM works with the TARTYP, which aims at fixing acceptable sound objects. These sound objects are considered from their beginning to their end.

Linguistics

My sign is based on the concept of the minimal unit, or discrete unit, which comes from linguistics (Benveniste, Jakobson). It refers to the smallest sonic element that cannot be divided. For instance, a word can be divided into syllables, and a syllable can be divided into phonemes, but a phoneme cannot be divided: it is a minimal unit. This minimal unit is the result of the association of different distinctive features. As linguistics and music both deal with sounds, I applied this method to electroacoustic music. In this way I obtained units smaller than the TARTYP units, which can be combined to describe bigger units, just as phonemes can be combined to create syllables and syllables can be combined to create words. I called the electroacoustic sound minimal unit a phase, and the bigger units an entity or a group. The distinctive features of a phase are the sound features that are described in the TARSOM and that I reorganised.

Another idea I took from linguistics to build my sign is the use of a small number of elements to create a great number of combinations. One can write several thousand words with only twenty-six letters. This prevents one from having to memorize a great number of elements and makes them easy to use.

Ch. S. Peirce's Sign Theory

Peirce defined a sign as a triadic relationship between the object, the representamen and its interpretant. Considering only the relation between the representamen and the object, he established three kinds of relation: icon, index and symbol. I wanted my sign to be easy to read. On the one hand, I used iconic representation every time I could, because it is very easy to understand: for example, concerning the dynamic profile, pitch increasing or decreasing, and gait. On the other hand, in its symbolic part, which needs an interpretant and is therefore more complex, I used the same symbols applied to different parameters of the sound to represent the same indications: a dot means little, a dash and a dot middle, a dash big, and a broken line means random. In this way I reduced the number of symbols one has to remember.
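This shared vocabulary can be pictured as a single lookup table reused across parameters. The following Python sketch is purely illustrative (the names are invented for the example, and this is not code from the Acousmoscribe):

```python
# Illustrative lookup table: the same quantity symbols are reused for
# every sound parameter (invented example, not Acousmoscribe code).
QUANTITY_SYMBOLS = {
    "dot": "little",
    "dash and dot": "middle",
    "dash": "big",
    "broken line": "random/irregular",
}

def read_symbol(parameter: str, symbol: str) -> str:
    """Interpret a quantity symbol in the context of a given parameter."""
    return f"{parameter}: {QUANTITY_SYMBOLS[symbol]}"

print(read_symbol("caliber", "dash"))      # caliber: big
print(read_symbol("gait speed", "dot"))    # gait speed: little
```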
                                                                    interested only in one phase sound objects, but I took into
Nelson Goodman's Theory                                             account all of the possible variations. Schaeffer had
In his book Languages of art, Nelson Goodman was                    already described melodic profile and mass profile. He
comparing notation and art work. According to him, the              put them in the category of criterions of sound variations.
characteristics of notation are semantic and syntactic non          The TARSOM establishes seven categories: three
ambiguity. In other words, each sign must not be                    1
                                                                      Di Santo, Jean-Louis, “Composer avec les UST”, Vers une sémiotique
confused with another, and its interpretant must be clear.          générale du temps dans les arts, Actes du colloque "Les Unités
Other conditions are syntactical and semantic disjuncture.          Sémiotiques Temporelles (UST), nouvel outil d'analyse musicale :
                                                                    théories et applications", Sampzon, Delatour, 2008
                                                               93
The TARSOM establishes seven categories: three categories regarding the sound itself (mass, dynamic, harmonic timbre), two categories regarding variations (melodic profile and mass profile) and two categories regarding maintenance (grain and gait). If one looks at column two of the TARSOM, one can read on the "timbre harmonique" line: "lié aux masses" (linked to masses); comparing the "masse" line to the "timbre harmonique" line, one sees almost the same classifications. I therefore merged these two criteria into the harmonic profile in order to simplify them. In the same way, still considering column two, the "dynamique" and "profil de masse" lines are very similar, and I merged them into the "dynamic profile". However, Schaeffer was mostly describing two-phase profiles: following the concept of the minimal unit, which is based on a unity of process, I considered only one-phase processes. Here too, still taking into account one-phase processes and adding the stability that was missing, I kept the "profil mélodique" line. I also transformed column 8 (impact) into the "rhythmic profile", with the characteristics of slow, moderate and fast that are in the TARSOM, to which I added the processes of accelerando, rallentando and irregular. The rhythmic profile refers both to the internal speed of a sound and to iterative processes of the same sound. In this way the four traditional musical criteria were redefined. Finally, I linked the maintenance criteria (grain and gait) to the dynamic profile and the melodic profile respectively. Some criteria, however, can be linked to others: the rhythmic profile, which describes speed variations, can be applied to iteration as well as to gait. Gait also often refers to the melodic profile, which contains the idea of pitch, and thus also to the caliber, which is the difference between the lowest and the highest frequency of the sound. Grain is a particular variation applied to the dynamics of the sound. In this way, some criteria that disappear from the seven Schaeffer criteria reappear applied to the four profiles.²

Criteria of Qualification/Evaluation

As shown above, some of them are integrated into the four profiles. The categories of species are integrated as quantities: small, middle, big or random. The concept of random or irregular is very useful when some processes change quickly in different ways: for example, to describe the sound of creaking wood, which is sometimes fast and sometimes slow.

Number of Phases of the Sound

In the TARSOM or the TARTYP, sounds can have several phases. For the reasons explained above, my sign describes sounds phase by phase. A phase refers to any kind of sound, whatever its duration, featuring the same process (this process commands the same modification or non-modification of the sound and can be applied to intensity, pitch, timbre or rhythm). Phase is the name I gave to the minimal electroacoustic sound unit. In this way one obtains four profiles, which will be described later. Profile is here the name of the distinctive feature.

Complexification of Mass/Harmonic Timbre

I merged the Schaefferian Mass and Harmonic Timbre into the term harmonic profile (in my sign, the species of mass are mainly linked to the melodic profile). The harmonic profile concerns the very matter of sound, which does not depend on pitch, dynamics or other criteria of form. Schaeffer determined seven categories of sound for this parameter (son pur, son tonique, groupe tonique, son cannelé, groupe nodal, nœud and bruit blanc). "Son pur" is the sine wave, and "bruit blanc" is white noise. They will not be taken into account here, since they do not vary (except the sine wave, whose pitch can vary depending on its height, which is not our purpose here). Thus five categories of sound remain. Their description, being very broad, is very imprecise, even if the number of categories is increased by the distinction between "simple" sounds and groups. According to Schaeffer, these five categories can be rich or poor. At EMS 11, in New York, I suggested increasing these categories.³ I determined three categories of homogeneous sounds, which can be rich or poor, and three categories of hybrid sounds, which can be rich or poor too. Combining these categories in groups or sons cannelés (dystonic sounds, to use Thoresen's translation), and adding stable or filtered colours (bright, dark, hollow...), one can obtain 40,000 descriptions of the harmonic profile.

Figure 1: Homogeneous sounds. The dash on each side of the symbol always means rich. The same holds, of course, for hybrid sounds.

² http://www.ems-network.org/IMG/EMS06-JLDSanto.pdf, p. 4-5
³ http://www.ems-network.org/IMG/pdf_EMS11_di_santo.pdf
Hybrid sounds will be represented as below:

Figure 2: Hybrid sounds. A hybrid sound is a sound that has features from two homogeneous categories. For example, a fly sound has features of a tonic sound and features of noise; it is thus represented by a line (tonic sound) made of dots (noise).

The twelve simple signs described above will be used to build all the other signs, and particularly what one will call "group" and "son cannelé". One will call a "group" sounds of the same category combined with one another. A group made of homogeneous sounds will be called a homogeneous group, and a group made of one or two hybrid sounds will be called a hybrid group. The sign that represents a group is made of two symbols. The lower one represents the sound one hears the most (called the fundamental), and the higher one represents the sound one hears the least, or as much as the other (called the harmonic).

If the sounds of the group belong to two different categories, one will call it a son cannelé. (For example, a bell sound is made of a first audible tonic sound and a thin inharmonic halo; tonic sound and inharmonic sound belong to two different categories, so a bell produces a son cannelé. This sound will be represented by a tonic symbol under an inharmonic symbol.) In order to have clearer signs, one will limit the number of symbols to two for a group and three for a son cannelé. To build all the possibilities of the son cannelé, one will use the table from Figure 4.

There are also symbols to describe the "colour" of the sound, whether it is more or less dark or bright. The symbol is a dot placed on the symbols of the harmonic profile, except for noise, which cannot have a colour because of its very rich spectrum.

Figure 5: Seven different "colours" of sound placed on the symbol of the tonic sound (the same can be done for inharmonic sounds). From left to right: equilibrated sound, strong low frequencies, weak high frequencies, strong medium frequencies, weak low frequencies, weak medium frequencies and strong high frequencies. These colours can be filtered and can change, but those symbols are not reproduced here.
If the dynamic profile varies irregularly, a broken line is added at the top of one of these figures.

Rhythmic Profile

It concerns the internal speed variation of a sound or the speed of its iteration (acceleration, deceleration or rhythm, the stability of gait or grain). It is notated by dots, dashes and dots, or dashes at the bottom of the rhythmic profile, as explained above. If the rhythmic profile varies irregularly, a broken line is added at the bottom of the figure. A vertical dash at the beginning or the end of the figure means rallentando or accelerando.

Melodic Profile

It concerns tessitura (pitch becoming higher, lower or staying stable). It is represented by five dots on the left or right side of the figure that represents the dynamic profile. The lowest one indicates a very low tessitura, the one above it a low tessitura, and so on up to a very high tessitura. A line is attached to these dots to represent the tessitura of the sound. This line can be straight and horizontal if the tessitura is always the same, or can come up or down if the pitch increases or decreases. This line is curved if the sound has gait. Finally, this line indicates the caliber of the sound: a line made of dots if the caliber is thin, dash and dot if it is middle, and a dash if it is large. The same symbols can be applied to a curve. If the melodic profile varies irregularly and quickly, a broken line is added at the end of this symbol.

Harmonic Profile

The term harmonic profile replaces the terms Mass and Harmonic Timbre in the TARSOM. It concerns harmonic timbre: richer, poorer or stable. The harmonic profile is represented by symbols inside the geometrical figure: a line for a tonic sound, a curve for inharmonic sounds and a dot for noise. The sound can be homogeneous, or hybrid if it has the features of two different sorts of sound (see above). Each category, homogeneous or hybrid, can be rich or poor.

Tonic and inharmonic sounds can have a colour (see above, EMS 11, New York). The sign allows seven stable colours and forty-two filtered colours to be represented.

The combination of two symbols belonging to the same category represents a group (tonic, inharmonic or noise) that can be homogeneous or hybrid. The combination of two symbols belonging to different categories represents dystonic sounds.

Number of Combinations

A complete sign is made by assembling symbols on the different sides of the dynamic profile or by putting them inside it.

Figure 8: An example of a complete sign. The triangle shows the dynamic profile and means straight attack. The line at the top of this figure shows that there is no grain. At the left one can see the melodic profile in a medium tessitura. This full line shows a large caliber; the line is curved, which means that there is a gait. This gait is small because there is only one curve. At the bottom of the figure, dots indicate that the speed of this gait is fast (rhythmic profile). The broken line in the rhythmic profile means that this rhythm is irregular. Finally, the two lines inside the figure represent the harmonic profile: they mean tonic group.

A sign, to be efficient, must be precise. This precision depends on the number of possibilities it offers. The Acousmoscribe sign offers five symbols for the dynamic profile, three symbols for grain and three symbols for the rhythmic profile. Dynamic and rhythmic profiles can be regular or irregular, so the number of possibilities is doubled.

There are five possibilities of stable melodic profile, ten possibilities describing increasing pitches (from very low to low, from very low to medium, from very low to high, from very low to very high, from low to high, and so on) and ten possibilities describing decreasing pitches. Of course, all these possibilities can present irregular processes. The symbol supporting the melodic profile also represents the caliber. There are three possibilities of caliber and, for each of them, the possibility of being irregular. If the sound has a gait, the line representing the melodic profile is replaced by one, two or three curves, depending on the amplitude of the gait.

Finally, there are three basic symbols for harmonic profiles, which become six by merging different features, and twelve by adding a symbol meaning "rich". As already said above, adding the different colours and the possibilities of groups and dystonic sounds, there are 40,000 possibilities of harmonic profile.

The different combinations of all these different symbols allow approximately five billion possibilities for building a sign that is always easy to read.
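As a rough sanity check on these figures, one can multiply the per-profile counts quoted above. The Python tally below is only a back-of-the-envelope sketch under the stated counts (taking the 40,000 harmonic profiles as given and ignoring gait variants), so it gives an order of magnitude rather than an exact census:

```python
# Back-of-the-envelope tally of sign combinations from the counts in the
# text (40,000 harmonic profiles taken as given; gait variants ignored).
dynamic  = 5 * 2               # five symbols, regular or irregular
grain    = 3                   # three grain symbols
rhythmic = 3 * 2               # three symbols, regular or irregular
melodic  = (5 + 10 + 10) * 2   # stable + rising + falling, each possibly irregular
caliber  = 3 * 2               # thin / middle / large, each possibly irregular
harmonic = 40_000              # colours, groups and dystonic sounds, as above

total = dynamic * grain * rhythmic * melodic * caliber * harmonic
print(f"{total:,}")  # 2,160,000,000 -- gait variants push this further,
                     # toward the author's estimate of roughly five billion
```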
FROM SIGN TO SCORE

The Acousmoscribe

The first step was the Acousmoscribe, an experimental piece of software created in 2009 that runs on Mac OS 10.6. It allows the creation of a sign by assembling the different symbols chosen to represent the different profiles and parameters. The window has two parts: the left part represents tracks on which one can place the signs; the right part is a palette where one can create the sign by assembling the symbols of the different profiles. One can assemble the symbols to create a sign and then put it on the tracks. This software was able to generate some sounds corresponding to the sign.⁵
⁵ Ibid.
⁶ http://www.ems-network.org/spip.php?article377

REFERENCES

[1] F. de Saussure, Cours de linguistique générale, Payot, 1972.
[2] P. Schaeffer, Traité des objets musicaux, Seuil, 1966.
[3] L. Thoresen, "Spectromorphological Analysis of Sound Objects: An Adaptation of Pierre Schaeffer's Typomorphology", EMS 2006, http://www.ems-network.org/IMG/EMS06-LThoresen.pdf, p. 3.
[4] D. Smalley, "La spectromorphologie, une explication des formes du son", http://www.ars-sonora.org/html/numeros/numero08/08d.htm
[5] M. Blackburn, "Composing from spectromorphological vocabulary: proposed application, pedagogy and metadata", http://www.ems-network.org/ems09/papers/blackburn.pdf
[6] E. Benveniste, Problèmes de linguistique générale, Gallimard, 1966.
[7] R. Jakobson, Essais de linguistique générale, t. 2, Paris, Minuit, 1973.
[8] C. S. Peirce, Écrits sur le signe, Paris, Seuil, 1978, pp. 148-149.
[9] N. Goodman, Langages de l'art, Paris, Hachette Littératures, 2005 (original edition: Jacqueline Chambon, 1990), pp. 168-192.
[10] MIM, Les Unités Sémiotiques Temporelles, Documents Musurgia, 1996.
[11] J.-L. Di Santo, "Composer avec les UST", Vers une sémiotique générale du temps dans les arts, Actes du colloque "Les Unités Sémiotiques Temporelles (UST), nouvel outil d'analyse musicale : théories et applications", Paris, Delatour, 2008, pp. 257-270.
    A PARADIGM FOR SCORING SPATIALIZATION NOTATION
ABSTRACT

The present paper is a shortened version of the one presented at ICMC/SMC 2014 [1], where it was demonstrated that SSMN (Spatialization Symbolic Music Notation) research seeks to establish a paradigm wherein OSC (Open Sound Control) [2] and a rendering engine allow a musical score to be heard in diverse surround formats.

The research team consists of composers, spatialization experts, IT specialists and a graphic designer. After having established a taxonomy identifying and classifying the spatiality of sound with its associated parameters, open-source software is being developed and tested by practitioners in the field. Composers, utilizing dedicated graphic symbols integrated into a score editor, have full control over spatialization characteristics. They can audition the results and communicate their intentions to performers (i.e. conductors, musicians, dancers, actors) as well as to all participants in the chain from rehearsal to performance.

SSMN capitalizes on time-based phenomena: choreographers can combine and synchronize sound and body movement; installation artists can program interactive visuals with audio manipulation; film and video can be enhanced with 3D sound effects and spatialized scores. SSMN focuses not only on musical composition, other performing and media arts, and even game interaction design, but is also useful in academic contexts such as professional training in conservatories and in musicological research addressing the perennity of spatialization in early electroacoustic music.

Copyright: © 2015 Emile Ellberger et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License 3.0 Unported, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

1. INTRODUCTION

Research on spatialization in music dates from practices in early civilizations through today's contemporary output. SSMN investigations concentrate on composers' means of expressing placement and/or motion in space (e.g. Stockhausen, Boulez, Brant), and on more recent methods of graphic representation proposed in various research centers (i.e. Ircam's OpenMusic [3] and Antescofo [4], MIM's UST [5], Grame's INScore [6]). During the past decade, certain composers using WFS [7] and Ambisonics [8] have pointed to the need for a musical notation wherein graphic symbols and CWMN (Common Western Music Notation) could coexist on a timeline along with audio rendering.

2. DEFINING A SPATIAL TAXONOMY

The SSMN Spatial Taxonomy is an open-ended systematic representation of all musically relevant features of sound spatiality. It is organized as follows: the basic units of the SSMN Spatial Taxonomy are called descriptors, i.e. room descriptors and descriptors of sound sources. Descriptors can be simple or compound and are assumed to be perceptually relevant. Simple descriptors denote all single primary features relevant to sound spatiality and can be represented as symbols. Compound descriptors are arrays of simple descriptors used to represent more complex spatial configurations and processes. Structural operations and behavioral interactions can be used to transform elements previously defined using descriptors or to generate new elements. Descriptors are progressively implemented in the project when proven to be of general user interest. Although the taxonomy classifies and describes sound in a three-dimensional space, some objects and symbols are, for practical reasons, represented in two dimensions. As this taxonomy contains a very systematic vocabulary, it proves to be useful for other research projects related to 3D audio currently under development at the ICST. To ensure the validity of the concepts within this taxonomy, the SSMN team has undertaken the task of testing the perception of sound-spatiality elements in both 2D and 3D modes, the key questions being what can be perceived or not, and under which conditions.
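To make this organization concrete, simple and compound descriptors can be modelled as a small nested data structure. The Python sketch below is one reading of the taxonomy, with invented descriptor names; it is not SSMN project code:

```python
# Illustrative model of the taxonomy's units (invented names, not SSMN code).
from dataclasses import dataclass, field

@dataclass
class SimpleDescriptor:
    """A single primary feature of sound spatiality, representable as a symbol."""
    name: str      # e.g. a hypothetical "source.elevation"
    value: float

@dataclass
class CompoundDescriptor:
    """An array of simple descriptors for a more complex spatial configuration."""
    name: str
    parts: list[SimpleDescriptor] = field(default_factory=list)

# A hypothetical compound descriptor for a circular trajectory:
circle = CompoundDescriptor("trajectory.circle", [
    SimpleDescriptor("center.azimuth", 0.0),
    SimpleDescriptor("radius.meters", 2.5),
    SimpleDescriptor("period.seconds", 8.0),
])
```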
5. INTER-APPLICATION COMMUNICATION

The use of OSC (Open Sound Control) allows messages to be directed to various target software modules. Typically, spatialization data from MuseScoreSSMN flows to an audio renderer-engine capable of spatializing in various output formats, e.g. Ambisonic B-format, WFS, or multi-channel encoded audio files. OSC messages and raw data are also routed to a DAW (Digital Audio Workstation) or to programming environments (e.g. SuperCollider, Csound, Max/MSP). At this time, exporting possibilities include MusicXML and SVG.
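As a concrete illustration of this kind of routing, the fragment below sends a source position to a renderer using the python-osc package. The address pattern and port are invented for the example and do not document the actual MuseScoreSSMN namespace:

```python
# Hypothetical illustration of score-to-renderer OSC traffic (python-osc).
# The address pattern and port are examples, not the documented SSMN namespace.
from pythonosc.udp_client import SimpleUDPClient

client = SimpleUDPClient("127.0.0.1", 57120)  # renderer listening on UDP

# Send an (azimuth, elevation, distance) triple for sound source 1.
client.send_message("/ssmn/source/1/aed", [45.0, 10.0, 2.0])
```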
6. DEVELOPING THE RENDERING ENGINE

Compatible with the open-source initiative for standardized Max/MSP modules, the SSMN Rendering Engine has been engineered to allow real-time spatialized audio rendering and visual feedback for all SSMN activity. Functionalities include OSC routing over UDP ports and user control of encoding and decoding in various formats; the user determines the speaker configuration, designs the distance characteristics and can select effects such as reverb, air absorption and Doppler. All audio activity can be saved and reopened in common audio file formats. Real-time visual feedback allows the user to monitor single or multiple trajectories and sound placements in 2D/3D. An AUAmbi plug-in allows communication with audio software that has an AU implementation. In order to facilitate overall OSC control, a set of descriptions was created to allow multiple cross-application communication, also adaptable to other protocol contexts such as SpatDIF and MusicXML (Figure 3).

Figure 3. SSMN Rendering Engine main screen.

…spatialization effects to be integrated into 3D cinema. Here a score for 9 instruments and electronics was originally notated in a popular score editor. Initially the composer created his personal symbols and spatialization annotations, but was limited to hearing the results in a stereo version. He then exported his score in MusicXML format (notation only) and imported it into MuseScoreSSMN, utilizing the SSMN spatialization symbols. The composition, with its accompanying audio files, was then rendered in B-format onto an Ambisonic speaker system. Having been able to audition the impact of the sound motion, he could subsequently edit and modify various parameters of the SSMN symbols to his taste and allow for more coherent musical effects. Interestingly, Gillioz had little experience in spatialization at first and began by creating erratic sound movements – skips and wide jumps at an eighth-note rate at 120 BPM. His esthetics obliged him to modify the displacement rate (speed and distance). Having mastered the process, he modified the score as necessary and gave us precious feedback.²

CHoreo by Melissa Ellberger, choreographer

CHoreo was a simple case study demonstrating the advantages of using SSMN within a rehearsal context. A choreographer trained performers wearing portable loudspeakers to move along trajectories in a hall. Sound files projected from the portable loudspeakers accompanied the body movements. In play mode, MuseScoreSSMN triggered sound files transmitted to the SSMN Rendering Engine, all the while sending streams of OSC data controlling the 3D spatialization process. The performers could execute their roles by following the printed MuseScoreSSMN score; the learning process prior to an actual public presentation was greatly facilitated (Figure 4).

Figure 4. CHoreo trajectory score.
8. CONCLUSION

At this stage of the SSMN "work in progress", its basic workflow is optimized for the use case in which notation for instrumental music (often incorporating live electronics) is introduced into a music editor and spatialized audio rendering is a requirement. Other use cases include the additional use of audio files managed within DAW software. SSMN equally targets state-of-the-art venues, namely 3D cinema (with its great need for encapsulating height information in surround systems), 5.1 radio and web-based broadcasting (video, music and radio theater productions), choreography notation, artistic multimedia and interactive installations, the surround CD, DVD and Blu-ray market, as well as game design.

An SSMN user group provides inestimable feedback. Questions that are continuously taken into account concern the type of strategies adopted, their usefulness, the choice of symbols, the clarity and speed of recognition, the flexibility offered by the tool set, and overall user-friendliness. Performers and audio engineers note that they find useful the features that allow them to consult both a printed version of the score containing the SSMN symbols and its electronic version, which renders the symbols on an active timeline.

The potential of the prototype was also tested with several choreographers and their composers at Tanzhaus Zürich. Results of the SSMN project have been incorporated into the composition curriculum at the Zurich University of the Arts and have been presented at the Haute École de Musique of Geneva. The experience with the composers, interpreters and composition students has shown that they have developed an increased awareness of spatialization possibilities within their own creative process and an augmented spatial listening acuity.

A future SSMN goal addresses developing awareness of spatialization through pedagogical interactive software for all school ages as well as for pre-professional music education. There also appears to be a need within musicological research for archiving and assuring the perennity of electroacoustic music, transcribed with symbols for study purposes. It is also expected that the SSMN project will contribute to generating a sustainable impact on creative processes involving three-dimensional spatialization.

Further aspects are also being investigated, such as integration within the MusicXML protocol and SpatDIF compatibility (Peters, Lossius and Schacher 2013). The SSMN tool set and documentation are available to the scientific and artistic communities via a website that has been set up to document project results, distribute the software, and receive user input.³ The SSMN workflow is shown below (Figure 5).

Figure 5. Basic MuseScoreSSMN I/O workflow.

Acknowledgements

The SSMN research team is grateful for the assistance and support offered by the Swiss National Foundation for Scientific Research, the members of the Institute for Computer Music and Sound Technology – Zurich University of the Arts, the Computer Music division of the Haute École de Musique of Geneva, composers Vincent Gillioz, Mathias Aubert and Adam Maor, and the participants of SSMN courses and the Tanzhaus choreographers.

³ http://blog.zhdk.ch/ssmn/

REFERENCES

[1] Ellberger, Toro-Perez, Schuett, et al., "Spatialization Symbolic Music Notation at ICST", Proceedings of ICMC|SMC 2014, Athens, 2014, pp. 1120-1125.
[2] Wright, Freed, and Momeni, "OpenSound Control: State of the Art 2003", Proceedings of the 2003 Conference on New Interfaces for Musical Expression (NIME-03), Montreal, Canada, 2003.
[3] Agon, Assayag, and Bresson (Eds.), The OM Composer's Book Vol. 1, Collection Musique/Sciences, Editions Delatour France/IRCAM, 2006.
[4] Cont, Giavitto, and Jacquemard, "From Authored to Produced Time in Computer-Musician Interactions", CHI 2013 Workshop on Avec le Temps! Time, Tempo, and Turns in Human-Computer Interaction, Paris, France, ACM, 2013.
[5] Favori, Jean, "Les Unités Sémiotiques Temporelles", Mathematics and Social Sciences, 45e année, n° 178, pp. 51-55.
[6] Fober, Orlarey, and Letz, "INScore – An Environment for the Design of Live Music Scores", Proceedings of the Linux Audio Conference (LAC 2012), 2012.
[7] Theile, Günther, "Wave Field Synthesis – A Promising Spatial Audio Rendering Concept", Proceedings of the 7th International Conference on Digital Audio Effects (DAFx'04), Naples, 2004.
[8] Daniel, Jérôme, Représentation de champs acoustiques, application à la transmission et à la reproduction de scènes sonores complexes dans un contexte multimédia, Thèse de doctorat de l'Université Paris 6, 2000.
 ACCRETION: FLEXIBLE, NETWORKED ANIMATED MUSIC NOTATION
          FOR ORCHESTRA WITH THE RASPBERRY PI
                                                           K. Michael Fox
                                                    Rensselaer Polytechnic Institute
                                                [email protected]
2. ACCRETION

My compositional intentions with Accretion required each section of the orchestra to act independently of the others, facilitating the coordination of instrumental articulations into clouds in which each event had a seemingly arbitrary timing. The components and structure of the piece, being derived from granular synthesis, relied on events happening in absolute time, as opposed to subdivisions of metrical time based on tempo. Coordinations of these events then form clouds or clusters that are partially identified by their densities. However, since the instruments playing these grain-like notes are resonating bodies activated by humans, three additional components are introduced: pitch, playing technique and articulations, and dynamics. Together, these components formed the main design considerations of the simulation system, programmed in C++ with openFrameworks.

2.1 Time & Pitch

The generative software system created "events" of two types: singular or durational. The singular events were realized as the shortest possible articulation of a note on the instrument. Durational events, on the other hand, could be long sustained notes or collections of staccato notes occurring in a strictly defined time duration. Both event types consisted of a single pitch assigned at the time of generation. These pitch assignments are determined by an active "global" pitch class, constraining all instrumental parts to a pre-defined harmonic space, specifically the octatonic (WH) scale and the whole-tone scale.

Durational notes had an additional property when comprised of collections of short notes. These staccato notes were to be articulated as fast as possible at the prescribed dynamic. The resulting effect is a slightly asynchronous timing for the events, resulting from the mechanical nature of the action and the limitations of the human body. This led to subtle emergent variation in the overlapping patterns of coexistent events, which was further amplified by the use of different playing techniques.
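A minimal sketch of the two event types, in Python rather than the piece's C++/openFrameworks code and with invented duration values, might look like this:

```python
# Minimal sketch of singular vs. durational events (illustrative Python;
# Accretion itself is written in C++ with openFrameworks).
import random
from dataclasses import dataclass

OCTATONIC_WH = [0, 2, 3, 5, 6, 8, 9, 11]  # whole-half octatonic pitch classes
WHOLE_TONE = [0, 2, 4, 6, 8, 10]

@dataclass
class Event:
    kind: str          # "singular" or "durational"
    pitch_class: int   # constrained to the active global collection
    duration: float    # seconds of absolute time (0.0 = shortest articulation)

def make_event(global_scale: list[int]) -> Event:
    """Generate one event constrained to the active global pitch collection."""
    if random.random() < 0.5:
        return Event("singular", random.choice(global_scale), 0.0)
    return Event("durational", random.choice(global_scale),
                 random.uniform(1.0, 8.0))  # invented duration range

cloud = [make_event(OCTATONIC_WH) for _ in range(20)]
```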
2.2 Technique

For each player part, divided by instrument section, the events were assigned articulation techniques idiomatic to the instruments. Because the orchestra was comprised of student players, I limited the techniques to those that they would feel comfortable and confident performing (i.e., not extended). A string instrument could perform events as one of: arco, pizzicato, col legno tratto, col legno battuto, staccato, or tremolo; winds would more uniformly play events legato.

These techniques would be assigned to each event as that event was generated and could apply to different event types in different ways. For example, col legno battuto for a singular event would be realized as a single strike of the bow, while a durational event using col legno battuto would be realized as multiple strikes of the bow over the course of the specified time duration, articulated as fast as possible.

2.3 Dynamics

In the case of singular events, a single discrete dynamic is generated (as there is only one short note played). Durational events, however, are assigned continuous dynamics that vary over the time duration of their articulation. Whether the durational event is a sustained note or a collection of short notes, it is contained within a continuous dynamic envelope that makes each moment of that event vary in volume, intensity, and timbral quality.

Dynamic envelopes for these durational events were based on the Attack, Decay, Sustain, and Release (ADSR) envelopes of electronic sound synthesis. However, abstracted from the synthesis function, each phase of the envelope can increase or decrease: an Attack phase of an envelope need not start at zero and can decrease before encountering the beginning of the Decay phase. The one exception to this freedom is the Release phase, which always approaches zero at the end of its duration.

These envelopes are applied to the durational events to vary the dynamics at any given moment between dal niente (when possible on the instrument) and fff (available, but seldom reached). Since a dynamic envelope is given to each durational event, each section of the orchestra is completely decoupled from the others with respect to crescendi or decrescendi. This allows the ebbing and flowing of different timbres over and under each other in graceful coordination (or, in reality, the lack thereof).
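The generalized envelope described here can be pictured as a piecewise-linear breakpoint function. The following Python fragment illustrates the idea with invented breakpoints; it is not Accretion's actual implementation:

```python
# Illustrative generalized ADSR: each phase is a linear ramp between freely
# chosen levels; only the Release is forced to approach zero.
# (Invented breakpoints; not Accretion's actual C++ code.)
def envelope(breakpoints, t):
    """Piecewise-linear dynamic level at time t.
    breakpoints: list of (time, level) pairs; the last level should be 0.0."""
    for (t0, v0), (t1, v1) in zip(breakpoints, breakpoints[1:]):
        if t0 <= t <= t1:
            return v0 + (v1 - v0) * (t - t0) / (t1 - t0)
    return 0.0

# Attack starts above zero (0.4) rather than at silence; Release ends at 0.
adsr = [(0.0, 0.4), (0.5, 0.9), (1.0, 0.7), (3.0, 0.7), (4.0, 0.0)]
print(envelope(adsr, 0.25))  # 0.65: halfway up the non-zero-start attack
```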
2.4 Notational Framework

David Kim-Boyle has noted that "computer-generated scores, particularly those that employ real-time animation, create a heightened sense of anticipation amongst performers, especially given that performance directive can change from moment to moment" [4]. Similarly, Pedro Rebelo notes that there is a delicate balance in animated notations between representing gestures too literally and too abstractly [5]. Given these two considerations and the goals of the piece, I believed that it was important to render the notation in a form that was reasonably approachable by any of the performers. Like the use of idiomatic over extended techniques, I wanted to present the notation for my piece in a way which functionally achieved the specific timings and re-
Figure 1. Reduction of concert score.

Figure 4. Singular Event.

Figure 7. Score rendering on linked Server (laptop, left) and Client (large monitor, right) applications.

Figure 8. Stage setup in the Concert Hall at Rensselaer Polytechnic Institute's Experimental Media and Performing Arts Center. Monitors 2 (middle), 3 (right), and 4 (left) are visible.
…individual pieces. Indeed, the system I designed was motivated by the need to realize the compositional ideas that formed Accretion. However, I see many potentials for generalizing the tool along several of the trajectories outlined above, to further enable the compositional ideas I am interested in.

The main addition that I see for the system is inspired by another common usage of the RPi: the networked media browser. Common media browser applications include KODI (formerly XBMC). In the case of networked scores, this type of score access would require the development and implementation of configuration flat-files to aid the networking functionality. The configuration file schema would theoretically allow scores to specify their role in the network at runtime (such as the particular instruments they display parts for, in the context of Accretion, for example), or support preset values that can be applied at device startup.
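A role-assignment flat-file of the kind proposed could be as simple as the following hypothetical schema (none of these keys are defined by an actual system; the JSON is shown being parsed from Python for concreteness):

```python
# Hypothetical configuration flat-file for a networked score client
# (invented schema, shown for illustration only).
import json

CONFIG = """
{
  "role": "client",
  "instruments": ["violin_1", "violin_2"],
  "server": "192.168.1.10",
  "port": 9000,
  "preset": "accretion_default"
}
"""

settings = json.loads(CONFIG)
print(settings["instruments"])  # the parts this device displays at runtime
```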
Similar benefits are represented by the ScorePlayer app developed by the Decibel Ensemble. This implementation simply provides an alternative platform, one that enables more diverse configurability and extended low-level control over notational structures, but at the expense of a less widely distributed device (relative to iOS devices and their ostensible ubiquity). However, as stated above, the packaging of scores on a standalone, pre-configured device requiring little to no setup by performers may increase the potential for wider distribution.

Acknowledgments

Special thanks to Ryan Ross Smith and Nicholas DeMaison.

5. REFERENCES

[4] D. Kim-Boyle, "Real-time score generation for extensible open forms," Contemporary Music Review, vol. 29, no. 1, pp. 3–15, February 2010.
[5] P. Rebelo, "Notating the unpredictable," Contemporary Music Review, vol. 29, no. 1, pp. 17–27, February 2010.
[6] C. Hope and L. Vickery, "Screen scores: New media music manuscripts," in Proceedings of the International Computer Music Conference, 2011, pp. 224–230.
[7] G. D. Barrett and M. Winter, "LiveScore: Real-time notation in the music of Harris Wulfson," Contemporary Music Review, vol. 29, no. 1, pp. 55–62, 2010.
                               SINGLE INTERFACE FOR MUSIC SCORE
                                SEARCHING AND ANALYSIS (SIMSSA)
…that SIMSSA will establish common standards and best practices for these types of music information retrieval and serve as a baseline for future work in this field.

3. BACKGROUND

OMR research began in the late 1960s and has seen limited but continuous interest, with several commercial software packages available (e.g., SmartScore and SharpEye). Development of this technology has been slow, and most of the research on OMR has concentrated on Common Western Notation, the most widely used music notation system today (for a recent review, see [1]). In the development stage of this project, we have created a site for the Liber usualis (http://liber.simssa.ca) to experiment with the methods and procedures of performing OMR on entire books containing older music notation. The website allows a user to search a digitized edition of this book using pitch names, neume names, and OCR-transcribed text [2] (Figure 1). Our experience with the Liber project has reinforced the need for a robust and efficient workflow system for OMR.
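To give a flavour of pitch-name search of the sort shown in Figure 1, the fragment below scans an indexed pitch sequence for a query pattern. It is a generic illustration, not the site's actual search implementation:

```python
# Generic illustration of pitch-name sequence search (not the actual
# Liber usualis search code).
def find_tune(index: list[str], query: list[str]) -> list[int]:
    """Return all start offsets where the query pitch sequence occurs."""
    n = len(query)
    return [i for i in range(len(index) - n + 1) if index[i:i + n] == query]

# Opening pitches of "Mary had a little lamb", as in the Figure 1 example.
chant_pitches = ["G", "E", "D", "C", "D", "E", "E", "E", "D", "D", "D"]
print(find_tune(chant_pitches, ["E", "D", "C", "D", "E", "E", "E"]))  # [1]
```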
Figure 1. Searching the Liber usualis website (http://liber.simssa.ca) with the "Mary had a little lamb" tune search.

…station. Advanced techniques used to perform image restoration and automatic music transcription, however, are computationally intensive, sometimes requiring hours or even days to run on personal computers. Instead, we are developing systems where computationally intensive procedures can be distributed across many powerful server machines attached to the Internet to perform processing in parallel, meaning any computer or mobile device with a modern web browser and access to the Internet may act as a document recognition station, offloading the computationally intensive recognition tasks to large clusters of computers in data centers. We see this as our most significant technological contribution: these techniques, known as distributed computing, are currently being explored in text-recognition research but have not yet been explored for music recognition systems.

There is a successful precedent for projects of this scope and scale. The IMPACT project [3] was a project funded by the European Union (€15.5 million, 2008–2012) that focused on digitization, transcription, and policy development for historical text documents. This project brought together national and specialized libraries, archives, universities, and corporate interests to advance the state of the art in automatic text document transcription, explicitly for the purposes of preserving and providing access to unique or rare historical documents. They have published significant advances in historical text recognition, tool development, policies, and best practices [4], [7].

At the core of the IMPACT project was a networked and distributed document recognition suite, providing a common document recognition platform for all their partners across Europe. As the computer vision and software engineering teams developed new tools and algorithms to improve recognition, these were made available immediately to all partners simply by updating the online suite of tools. All partners could then supply real-time feedback and evaluation on these updates, comparing them to previous techniques "in the field" and reporting their findings. The development teams then incorporated the feedback into further developments and refinements. This project has become self-sustaining and is now known as the IMPACT Centre of Competence, a not-for-profit organization that continues to build the technologies and best practices of the formerly funded project. This represents a model that we hope to reproduce in the domain of music.
…correction mechanisms, networked databases, and tools for analyzing, searching, retrieving, and data-mining symbolic music notation. Responsibility for developing these tools within the project is shared between the Content and Analysis axes.

The Content axis is divided into three sub-axes: Recognition, Discovery, and Workflow. The Recognition sub-axis is responsible for developing the underlying technologies in machine learning and computer vision. The Discovery sub-axis is responsible for large-scale web crawling, finding, and identifying images of books that contain musical content. Finally, the Workflow sub-axis is responsible for developing user-friendly web-based tools that harness the technologies developed by the other two sub-axes.

The Analysis axis is divided into two sub-axes: Search and Retrieval, and Usability. Searching music is complex since, unlike text, it is not simply a string of characters: there are pitches, rhythms, text, multiple voices sounding simultaneously, chords, and changing instrumentation. The Search and Retrieval sub-axis is responsible for developing ways of mining the notation data generated by the Content axis in all its complexity, building on the work done in the ELVIS Digging into Data Challenge project (http://elvisproject.ca). This sub-axis is also developing techniques for computer-aided analysis of musical scores. The Usability sub-axis is responsible for studying retrieval systems and user behavior within the context of a symbolic music retrieval system, identifying potential areas where the tools may be improved to suit real-world retrieval needs.

5.2 Content: Discovery sub-axis

Mass digitization projects have been indiscriminately digitizing entire libraries' worth of documents—both text and musical scores—and making them available on individual libraries' websites. The Discovery sub-axis is developing a system that will automatically crawl millions of page images looking for digitized books with musical examples [8]. When it finds a document contain-

…The next logical step is to bring these systems to our cloud-based OMR platform. This will allow us to distribute the correction tasks to potentially thousands of users around the globe, thereby providing the means to collect large amounts of human correction data. This crowd-sourced adaptive recognition system will be the first of its kind [12].

5.4 Content: Workflow sub-axis

The Workflow sub-axis is primarily responsible for developing Rodan, the core platform for managing cloud-based recognition. Rodan is an automatic document recognition workflow platform. Its primary function is to allow users to build custom document recognition workflows containing document recognition tasks, such as image pre-processing and symbol recognition (Figure 2). Rodan is capable of integrating many different recognition systems, such as Aruspix and Gamera, with other systems (e.g., integrating text recognition tasks for performing automatic lyric extraction) (see Figure 3). Once a workflow has been created, Rodan manages digital document images' progression through these tasks. Users interact with their workflows through a web application, allowing them to manage their document recognition on any Internet-connected device, but all tasks are actually run on the server side. Storage and processing capabilities can be expanded dynamically, and new tasks can be seamlessly integrated into the system with no need for the users to update their hardware or software.
areas where the tools may be improved to suit real-world               Moreover, as a web-based system, Rodan can incorpo-
retrieval needs.                                                     rate many different methods for distributed correction or
                                                                     “crowd-sourcing” to provide human-assisted quality
5.2 Content: Discovery sub-axis                                      control and recognition feedback for training and improv-
Mass digitization projects have been indiscriminately                ing recognition accuracy. This follows a similar model to
digitizing entire libraries’ worth of documents—both text            that proposed by the IMPACT project where distributed
and musical scores—and making them available on                      proof-readers provide feedback. These proof-readers
individual libraries’ websites. The Discovery sub-axis is            correct any misrecognized symbols, and their corrections
developing a system that will automatically crawl mil-               will then be fed back into the recognition system, thereby
lions of page images looking for digitized books with                improving the recognition for subsequent pages and
musical examples [8]. When it finds a document contain-              documents. This type of crowd-sourced correction system
ing printed music it will use the OMR software to tran-              is employed in many text-recognition projects [13], [14],
scribe and index the music content for these documents.              but there are no such systems in development for musical
                                                                     applications. The success of crowd-sourcing as a viable
5.3 Content: Recognition sub-axis                                    means of collecting correction and verification data has
One of the major tasks of the Recognition sub-axis is the            been demonstrated by a number of projects, most notably
integration of two desktop open-source OMR software                  the Australian newspaper (TROVE) [15], Zooniverse
platforms: Gamera, a document analysis toolkit [9], and              [16] and reCAPTCHA [17]. Along with developing the
Aruspix, an advanced OMR system developed by Laurent                 technical mechanisms for crowd-sourced musical correc-
Pugin [10]. These systems are unique for their ability to            tions, the Workflow team is also working with the Usa-
“learn” from their mistakes by using human corrections               bility sub-axis on creating new ways to entice users to
of misrecognized symbols to improve their recognition                participate. Some ways of doing this would be to create a
abilities over time. We have shown this to be cost-                  game that rewards users with points or community credi-
effective in digitization and recognition workflows [11].            bility in exchange for performing work [18], or reframing
musical tasks as simple non-musical tasks (e.g., shape or colour recognition) so that they become solvable by an untrained audience. By diversifying the number of approaches to collecting crowd-sourced correction data, we expect to appeal to a wide range of communities, from specialists to the general public.

  Later in this project, we will experiment with optical character recognition (OCR) for print and manuscript sources of music. By this point in the project we will have collected a large number of written texts with human-transcribed ground-truth data. We will use this data to train machine-learning algorithms to automatically recognize the various text scripts present in these sources. Our goal here is to automatically align text with the music above it, an important step that represents a significant challenge, and an avenue of research that has never before been explored. This will allow users to perform searches for recurring patterns that include music and text—to identify whether, for example, a particular musical idiom is frequently used when the text refers to "God" or "love"—a type of search that is not possible with current systems. When the text-alignment task is complete, the Recognition team will work with the Analysis team to design and implement a search interface so that users can search music and text simultaneously.

  Many musical documents, especially those that are hundreds of years old, pose difficulties for computer recognition due to faded inks, bleed-through, water, or insect damage. Each of these problems is a potential source of transcription errors. The Recognition team is working on integrating the latest document-imaging enhancement technologies, such as adaptive binarization, bleed-through reduction, colour adjustment, and distortion analysis and correction.

  It is also important to have a robust, modern file format that meets our needs for storing all of the symbolic data representations of these musical documents. Based on previous work, we have chosen the MEI (Music Encoding Initiative) format [19]. As part of the SIMSSA project we will be forming a workgroup to enhance MEI support for digital encoding of early notation systems for chant and polyphonic music.
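  As a concrete illustration of the first of the enhancement steps mentioned above, the sketch below shows one simple form of adaptive binarization, in which each pixel is compared against its local neighbourhood rather than a single global threshold. This is our own minimal example, not the project's implementation; the function name and the window/offset values are illustrative.

    import numpy as np
    from scipy.ndimage import uniform_filter

    def adaptive_binarize(page, window=31, offset=0.02):
        """Binarize a grayscale page image (values in [0, 1]).

        Each pixel is compared to the mean of its local window, so the
        threshold adapts to uneven lighting and faded regions of the page.
        """
        local_mean = uniform_filter(page.astype(float), size=window)
        # Pixels darker than their neighbourhood (by more than `offset`)
        # are treated as ink; everything else as background.
        return page < (local_mean - offset)

More elaborate variants (e.g., also weighting by the local standard deviation) follow the same pattern.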
Figure 3: Rodan's Gamera symbol classifier interface. The symbols are from a 10th-century St. Gallen music manuscript.
  To evaluate Rodan and the accuracy of our OMR systems, we have selected several manuscripts and early printed scores that will be processed in order of increasing difficulty for our tools. We have started with a selection of Renaissance prints and late chant manuscripts, some of which are already available online (Figure 4).

Figure 4: Salzinnes manuscript website, http://cantus.simssa.ca.

  As we proceed, we will evaluate the strengths and weaknesses of the workflow system, constantly adjusting our methods before moving on to the next source. For each document, we will create a human transcription of the music notation. This data will be the "ground truth" against which we will evaluate the performance and accuracy of the OMR system (and later of the OCR system). This will allow us to quantify any improvements we make in our OMR systems as we develop new recognition methods. By making incremental modifications using different types of sources, we hope to build a robust system capable of processing a wide range of musical documents.

5.5 Analysis: Search and Retrieval sub-axis

The Search and Retrieval component of the Analysis axis will involve music historians and music theorists, who will investigate how computerized searches of large collections of digital music can fundamentally change music history, analysis, and performance. They will develop new techniques for searching and analyzing digitized symbolic music. Searching music poses special challenges. A search interface must be able to search for strings of pitches, rhythms, or pitches and rhythms combined; search polyphonic music for multiple simultaneous melodies or chords; and search vocal music for both text and music. Searching and retrieving are only the beginning, however; members of the Analysis axis are developing software for many different types of computerized analysis of large amounts of music. This will allow scholars to describe style change over time, discovering which features of style stay the same and which change, or to describe what makes one composer's music unlike that of his or her contemporaries. Musicians and students will be able to find all the different ways composers have
harmonized a specific melody from the Middle Ages to the present. Representation of search and analysis findings will be another focus of this axis, investigating new methods for searching and retrieving millions of digitized music documents.

  Recent projects such as the Josquin Research Project (http://jrp.ccarh.org), the Music Ngram Viewer (http://www.peachnote.com) [20], and the ELVIS project (now part of the SIMSSA project) are already searching millions of notes. All these projects, however, have mostly depended on centralized, labour-intensive, manual processes to transcribe the sources into symbolic notation, append metadata to the resulting files, and arrange them in structured databases. SIMSSA will greatly streamline this process through automation and distributed labour, and enable the sophisticated automatic music analysis of very large corpora begun through ELVIS.

5.6 Analysis: Usability sub-axis

Librarians and information scientists are leading the Usability sub-axis. They continually review the usability of our tools—Rodan, search interfaces, crowd-sourcing interfaces, and analysis and visualization interfaces—considering the needs and skillsets of many different types of users, from senior music scholars with little technical expertise, to computer-savvy amateur musicians, to choral directors and guitarists searching for sheet music.

6. CONCLUSIONS

The most important outcome of this project is to allow users—scholars, performers, composers, and the general public—to search and discover music held in archives and libraries around the world. We expect that this will fundamentally transform the study of music and allow a global audience of musicians and artists to discover previously unknown or overlooked pieces for performance, making undiscovered repertoires that extend beyond the classics available to the general public. We also expect the public availability of large amounts of musical data to lead to significant advances in the field of music theory and the birth of the long-awaited field of computational musicology. Lastly, we expect that the free and open-source tools we are developing will help lead to significant advances in the following areas, all of which are either completely new or novel applications of existing technologies:

• Public, web-based tools for historical image restoration;
• Public, web-based distributed ("cloud") processing tools for OMR and OCR;
• A large database of automatically transcribed music;
• Prototypes for a web-based editor for making corrections or comparative editions of digital sources;
• A music exploration interface allowing quick and efficient content-based search and retrieval across a large-scale notation database; and
• Advanced public, web-based music analytical tools.

Acknowledgments

This research is supported, in part, by the Social Sciences and Humanities Research Council of Canada, le Fonds de Recherche du Québec sur la Société et la Culture, and McGill University.

7. REFERENCES

[1] A. Rebelo, I. Fujinaga, F. Paszkiewicz, A. R. S. Marcal, C. Guedes, and J. S. Cardoso, "Optical music recognition: State-of-the-art and open issues," International Journal of Multimedia Information Retrieval, pp. 1–18, March 2012.

[2] A. Hankinson, J. A. Burgoyne, G. Vigliensoni, A. Porter, J. Thompson, W. Liu, R. Chiu, and I. Fujinaga, "Digital document image retrieval using optical music recognition," in Proceedings of the International Society for Music Information Retrieval Conference, Porto, Portugal, 2012, pp. 577–582.

[3] H. Balk and L. Ploeger, "IMPACT: Working together to address the challenges involving mass digitization of historical printed text," OCLC Systems & Services, vol. 25, no. 4, pp. 233–248, 2009.

[4] V. Kluzner, A. Tzadok, Y. Shimony, E. Walach, and A. Antonacopoulos, "Word-based adaptive OCR for historical books," in Proceedings of the 10th International Conference on Document Analysis and Recognition, Barcelona, Spain, 2009, pp. 501–505.

[5] C. Neudecker, S. Schlarb, Z. Dogan, P. Missier, S. Sufi, A. Williams, and K. Wolstencroft, "An experimental workflow development platform for historical document digitisation and analysis," in Proceedings of the Workshop on Historical Document Imaging and Processing, Singapore, 2011, pp. 161–168.

[6] Z. M. Dogan, C. Neudecker, S. Schlarb, and G. Zechmeister, "Experimental workflow development in digitisation," in Proceedings of the International Conference on Qualitative and Quantitative Methods in Libraries, Chania, Greece, 2010, pp. 377–384.

[7] A. Antonacopoulos, D. Bridson, C. Papadopoulos, and S. Pletschacher, "A realistic dataset for performance evaluation of document layout analysis," in Proceedings of the International Conference on Document Analysis and Recognition, Barcelona, Spain, 2009, pp. 296–300.
[8] D. Bainbridge and T. Bell, "Identifying music documents in a collection of images," in Proceedings of the International Conference on Music Information Retrieval, Victoria, BC, 2006, pp. 47–52.

[9] K. MacMillan, M. Droettboom, and I. Fujinaga, "Gamera: A structured document recognition application development environment," in Proceedings of the International Conference on Music Information Retrieval, Bloomington, IN, 2001, pp. 15–16.

[10] L. Pugin, J. Hockman, J. A. Burgoyne, and I. Fujinaga, "Gamera versus Aruspix: Two optical music recognition approaches," in Proceedings of the International Conference on Music Information Retrieval, Philadelphia, PA, 2008, pp. 419–424.

[11] L. Pugin, J. A. Burgoyne, and I. Fujinaga, "Reducing costs for digitising early music with dynamic adaptation," in Proceedings of the European Conference on Digital Libraries, Budapest, Hungary, 2007, pp. 471–474.

[12] S. Charalampos, A. Hankinson, and I. Fujinaga, "Correcting large-scale OMR data with crowdsourcing," in Proceedings of the International Workshop on Digital Libraries for Musicology, London, UK, 2014, pp. 88–90.

[13] H. Goto, "OCRGrid: A platform for distributed and cooperative OCR systems," in Proceedings of the International Conference on Pattern Recognition, Hong Kong, 2006, pp. 982–985.

[14] G. Newby and C. Franks, "Distributed proofreading," in Proceedings of the Joint Conference on Digital Libraries, Houston, TX, 2003, pp. 361–363.

[15] R. Holley, "Extending the scope of Trove: Addition of e-resources subscribed to by Australian libraries," D-Lib Magazine, vol. 7, no. 11/12, 2011.

[16] K. D. Borne and Zooniverse Team, "The Zooniverse: A framework for knowledge discovery from citizen science data," in Proceedings of the American Geophysical Union Fall Meeting, 2011.

[17] L. Von Ahn, B. Maurer, C. McMillen, D. Abraham, and M. Blum, "reCAPTCHA: Human-based character recognition via web security measures," Science, vol. 321, no. 5895, pp. 1465–1468, 2008.

[18] L. Von Ahn, "Games with a purpose," Computer, vol. 39, no. 6, pp. 92–94, 2006.

[19] A. Hankinson, P. Roland, and I. Fujinaga, "MEI as a document encoding framework," in Proceedings of the International Society for Music Information Retrieval Conference, Miami, FL, 2011, pp. 293–298.

[20] V. Viro, "Peachnote: Music score search and analysis platform," in Proceedings of the International Society for Music Information Retrieval Conference, Miami, FL, 2011, pp. 359–362.
                                              BROWSING SOUNDSCAPES
a central position in the music formalization itself [5]. As long as events are identified, though, we can assume that the previous soundscape-oriented considerations hold for musical audio files as well.

  The TM-chart [6] is a recently developed tool that provides compact soundscape representations starting from a set of sound events. This representation constitutes a bridge between physical measures and categorization, including acoustic and semantic information. Nevertheless, the creation of a TM-chart relies on manual annotation, which is a tedious and time-consuming task. Hence, the use of TM-charts in the context of big data sets or for online browsing applications seems unthinkable.

  Besides sound visualization, automatic annotation of audio recordings has recently made significant progress. The general public has witnessed the generalization of speech recognition systems. Significant results and efficient tools have also been developed in the fields of Music Information Retrieval (MIR) and Acoustic Event Detection (AED) in environmental sounds [7], which leads us to expect dependable AED in the coming years.

  In this paper, we propose a new paradigm for soundscape representation and browsing based on the automatic identification of predefined sound events. We present a new approach to create compact representations of sounds and soundscapes that we call SamoCharts. Inspired by TM-charts and recent AED techniques, these representations can be efficiently applied to browsing sound databases. In the next section we present a state of the art of online sound representations. The TM-chart tool is then described in Section 3, and Section 4 proposes a quick review of Audio Event Detection algorithms. Then we present in Section 5 the process of SamoChart creation, and some applications with field recordings in Section 6.

2. SOUND REPRESENTATION

2.1 Temporal Representations

From the acoustic point of view, the simplest and predominant representation of a sound is the temporal waveform, which describes the evolution of sound energy over time. Another widely used tool in sound analysis and representation is the spectrogram, which shows more precisely the evolution of the amplitude of frequencies over time. However, spectrograms remain little used by the general public.

  While music notation for instrumental music has focused on the traditional score representation, the contemporary and electro-acoustic music communities have introduced alternative symbolic representation tools for sounds, such as the Acousmograph [8], and the use of multimodal information [9].

  All these temporal representations are more or less informative depending on the evolution of the sound over the considered duration. In particular, in the case of field recordings, they are often barely informative.

2.2 Browsing Sound Databases

On a majority of specialized websites, browsing sounds is based on textual metadata. For instance, freeSFX 2 classifies sounds by categories and subcategories, such as public places and town/city ambience. In a given subcategory, each sound is only described with a few words of text. Therefore, listening is still required to select a particular recording.

  Other websites, such as the Freesound project, 3 add a waveform display to the sound description. In the case of short sound events, this waveform can be very informative. On this website it is colored according to the spectral centroid of the sound, which adds some spectral information to the image. However, this mapping is not precisely described, and remains more aesthetic than useful.

  The possibility of browsing sounds with audio thumbnailing has been discussed in [10]. In this study, the authors present a method for searching for structural redundancy, such as the chorus in popular music. However, to our knowledge, this kind of representation has not been used in online systems so far.

  More specific user needs have recently been observed through the DIADEMS project 4 in the context of audio archive indexing. Through the online platform Telemeta, 5 this project allows ethnomusicologists to visualize specific acoustic information besides the waveform and recording metadata, such as audio descriptors and semantic labels. This information aims at supporting the exploration of a corpus as well as the analysis of a recording. This website illustrates how automatic annotation can help to index and organize audio files. Improving its visualization could help to assess the similarity of a set of songs, or to underline the structural form of the singing turns by displaying homogeneous segments.

  Nevertheless, texts and waveforms remain the most used and widespread tools on websites. In the next sections, we present novel alternative tools that have been specially designed for field recording representation.

  2 http://www.freesfx.co.uk/
  3 https://www.freesound.org/
  4 http://www.irit.fr/recherches/SAMOVA/DIADEMS/
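  Freesound's exact centroid-to-colour mapping is, as noted in Section 2.2, not precisely documented; as a reference point, here is a typical computation of the underlying descriptor. This is our own illustration with numpy, not Freesound's code:

    import numpy as np

    def spectral_centroid(samples, sample_rate):
        """Amplitude-weighted mean frequency of a mono signal, the
        descriptor commonly mapped to colour in waveform displays."""
        spectrum = np.abs(np.fft.rfft(samples))
        freqs = np.fft.rfftfreq(len(samples), d=1.0 / sample_rate)
        total = spectrum.sum()
        return 0.0 if total == 0 else float((freqs * spectrum).sum() / total)

In practice the signal is cut into short frames and the centroid is computed per frame, giving a colour value per position in the waveform.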
3. TM-CHART

3.1 Overview

The Time-component Matrix Chart (abbreviated TM-chart) was introduced by Kozo Hiramatsu et al. [6]. Based on a <Sound Source × Sound Level> representation, this chart provides a simple visual illustration of a sonic environment recording, highlighting the temporal and energetic presence of sound sources. Starting from a predetermined set of sound events (e.g. vehicles, etc.), and after preliminary annotation of the recording, the TM-chart displays percentages of time of audibility and percentages of time of level ranges for the different sound sources. TM-charts constitute effective tools for comparing sonic environments (for instance, daytime versus nighttime recordings).

3.2 Method

Despite a growing bibliography [11, 12], the processing steps involved in the creation of TM-charts have not been precisely explained. We describe in this part our understanding of these steps and our approach to creating a TM-chart.

  In this process, we can notice that the sound level of a segment is not exactly the sound level of its predominant source. Indeed, the sound level of an excerpt depends upon the level of each sound source, and not only the predominant one. However, we assume that these two measures are fairly correlated.

3.2.3 Creation of the TM-chart

We can now calculate the total duration in the recording (in terms of predominance) and the main sound levels for each category of sound. From this information, a TM-chart can be created.

  Figure 2 shows a TM-chart based on the example from Figure 1. It represents, for each category of sound, the percentage of time and energy in the soundscape. The abscissa axis shows the percentage of predominance for each source in the recording. For one source, the ordinate axis shows the duration of its different sound levels. For example, the car horn is audibly dominant for over 5% of the time. Over this duration, the sound level of this event exceeds 60 dB for over 80% of the time.
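  To make the aggregation concrete, here is a small sketch of how annotated segments could be turned into the percentages a TM-chart displays. It reflects our reading of the steps above; the data layout and the level bands are illustrative, not those of [6]:

    from collections import defaultdict

    # Each annotated segment: (predominant source, duration in s, level in dB).
    SEGMENTS = [("car-horn", 12.0, 65.0), ("car-horn", 3.0, 55.0),
                ("voices", 40.0, 52.0), ("birds", 25.0, 38.0)]

    LEVEL_BANDS = ((0, 40), (40, 60), (60, 120))  # dB ranges, illustrative

    def tm_chart_data(segments, bands=LEVEL_BANDS):
        """Aggregate segments into TM-chart percentages: the share of
        total time each source is predominant, and the split of that
        time over sound-level bands."""
        total = sum(dur for _, dur, _ in segments)
        by_source = defaultdict(lambda: defaultdict(float))
        for source, dur, level in segments:
            band = next(b for b in bands if b[0] <= level < b[1])
            by_source[source][band] += dur
        chart = {}
        for source, band_times in by_source.items():
            src_total = sum(band_times.values())
            chart[source] = {
                "predominance_pct": 100.0 * src_total / total,
                "level_pct": {b: 100.0 * t / src_total
                              for b, t in band_times.items()},
            }
        return chart

    print(tm_chart_data(SEGMENTS))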
4. AUDIO EVENT DETECTION

5. SAMOCHART

5.1 SamoChart based on Event Segmentation
5.2 SamoChart based on Confidence Values

Most Audio Event Detection algorithms actually provide more information than the output segmentation. In the following approach, we propose to compute SamoCharts from the confidence scores of these algorithms.

  For each target sound, we use the temporal confidence values output by the method, which can be considered as probabilities of presence (between 0 and 1). The curve in Figure 4 shows the evolution of the confidence for the presence of a given sound event during the analyzed recording. We use a threshold on this curve to decide whether the sound event is considered detected or not. This threshold is fixed depending on the detection method and on the target sound. To obtain different confidence measures, we divide the portion above the threshold into different parts.

6. APPLICATIONS

…display SamoCharts, which performs a fast and "on the fly" computation of the SamoChart. The code is downloadable from the SAMoVA web site. 6 It uses an object-oriented paradigm to facilitate future development.

  In order to facilitate browsing applications, we also chose to modify the size of the chart according to the duration of the corresponding sound excerpt. We use equation (2) to calculate the height h of the SamoChart from a duration d in seconds:

  h = \begin{cases} 1 & \text{if } d < 1 \\ 2 & \text{if } 1 \leq d < 10 \\ 2 \log_{10}(d) & \text{if } d \geq 10 \end{cases} \qquad (2)

  We also implemented a magnifying-glass function that provides a global view of the corpus with the possibility of zooming in on a set of SamoCharts. Furthermore, the user can hear each audio file by clicking on the plotted charts.

  6 pageciess.html
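  Equation (2) and the band-splitting described in Section 5.2 are simple to implement; the sketch below is our own illustration (the number of bands and the threshold are parameters that, per the text, depend on the detection method and the target sound):

    import math

    def samochart_height(d):
        """Chart height as a function of excerpt duration d in seconds,
        following equation (2)."""
        if d < 1:
            return 1.0
        if d < 10:
            return 2.0
        return 2.0 * math.log10(d)

    def confidence_band(confidence, threshold, n_bands=3):
        """Map a confidence in [0, 1] to a band index: 0 = not detected;
        the portion above the threshold is split into n_bands equal parts."""
        if confidence < threshold:
            return 0
        width = (1.0 - threshold) / n_bands
        return min(n_bands, int((confidence - threshold) / width) + 1)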
6.3 SoundMaps

Other applications can be found for the iconic chart of a soundscape. Soundmaps, for example, are digital geographical maps that put the emphasis on the soundscape of each specific location. Various sound map projects have been proposed in the last decade (see [19] for a review). Their goals are various, from giving people a new way to look at the world around them, to preserving the soundscape of specific places. However, as is generally the case with sound databases, the way sounds are displayed on the map is usually not informative. The use of SamoCharts on soundmaps can facilitate browsing and make the map more instructive.

Figure 9. Analysis of the first melody of Ravel's Boléro (repetition numbers 1, 4 and 9, and global analysis). The horizontal axis corresponds to the percentage of time during which a family of instruments is present. This percentage is divided by the number of instruments: the total reaches 100% only if all instruments play all the time. The vertical axis displays the percentage of time an instrument is played in the different nuances.

  7 https://serv.cusp.nyu.edu/projects/urbansounddataset/
7. CONCLUSION AND FUTURE WORK

In this paper, we presented a new approach to creating charts for sound visualization. This representation, which we name the SamoChart, is based on the TM-chart representation. Unlike TM-charts, the computation of SamoCharts does not rely on human annotation. SamoCharts can be created from Audio Event Detection algorithms and computed on big sound databases.

  A first kind of SamoChart simply uses the automatic segmentation of the signal from a set of predefined sound sources. To mitigate possible inaccuracies in the segmentation, we proposed a second approach based on the confidence scores of the previous methods.

  We tested SamoCharts with two different sound databases. In comparison with other representations, SamoCharts make browsing considerably easier. On the one hand, they constitute a precise comparison tool for soundscapes. On the other hand, they make it possible to see what kinds of soundscapes compose a corpus.

  We also assume that the wide availability of SamoCharts would make them even more efficient for users who are accustomed to them. In this regard, we could define a fixed set of colors, each corresponding to a target sound.

  The concepts behind TM-charts and SamoCharts can finally be generalized to other kinds of sonic environments, for example for music analysis and browsing.

[6] K. Hiramatsu, T. Matsui, S. Furukawa, and I. Uchiyama, "The physical expression of soundscape: An investigation by means of time component matrix chart," in INTER-NOISE and NOISE-CON Congress and Conference Proceedings, vol. 2008, no. 5, Institute of Noise Control Engineering, 2008, pp. 4231–4236.

[7] A. Mesaros, T. Heittola, A. Eronen, and T. Virtanen, "Acoustic event detection in real life recordings," in Proceedings of the 18th European Signal Processing Conference, 2010, pp. 1267–1271.

[8] Y. Geslin and A. Lefevre, "Sound and musical representation: The Acousmographe software," in Proceedings of the International Computer Music Conference, 2004.

[9] D. Damm, C. Fremerey, F. Kurth, M. Müller, and M. Clausen, "Multimodal presentation and browsing of music," in Proceedings of the 10th International Conference on Multimodal Interfaces, ACM, 2008, pp. 205–208.

[10] M. A. Bartsch and G. H. Wakefield, "Audio thumbnailing of popular music using chroma-based representations," IEEE Transactions on Multimedia, vol. 7, no. 1, pp. 96–104, 2005.
[16] P. Guyot, J. Pinquier, and R. André-Obrecht, "Water sound recognition based on physical models," in Proceedings of the 38th International Conference on Acoustics, Speech, and Signal Processing (ICASSP), IEEE, 2013.

[17] J. Pinquier, J.-L. Rouas, and R. André-Obrecht, "Robust speech/music classification in audio documents," Entropy, vol. 1, no. 2, p. 3, 2002.

[18] P. Guyot and J. Pinquier, "Soundscape visualization: A new approach based on automatic annotation and SamoCharts," in Proceedings of the 10th European Congress and Exposition on Noise Control Engineering (EURONOISE), 2015.

[19] J. Waldock, "Soundmapping: Critiques and reflections on this new publicly engaging medium," Journal of Sonic Studies, vol. 1, no. 1, 2011.
                     GRAPHIC TO SYMBOLIC REPRESENTATIONS OF
                               MUSICAL NOTATION
                     1. INTRODUCTION
The SCORE notation editor is the oldest music-
typesetting program in continual use. It was created at
Stanford University in the early 1970’s by Leland Smith
and initially was developed on mainframe computers
with output to pen plotters that was then photo-reduced
for publication. In the 1980’s SCORE was ported to IBM
PCs running MS-DOS with output to Adobe PostScript,
and later ported to Microsoft Windows. Due to the pro-
gram’s long-term stability and excellent graphical output,
many critical editions have been created over the years
using SCORE, such as the complete works of Boulez,
Verdi, Wagner, C.P.E. Bach, Josquin and Dufay.
      Throughout its history the SCORE editor has used a
simple and compact data format that allows forwards and backwards compatibility between different versions of the SCORE editor. The music representation system is symbolic, but highly graphical in nature. Each notational element is represented by a list of numbers that derive their meanings based on their positions in the list. This format was adapted from the one used in the Music V software.

Copyright: © 2015 Craig Stuart Sapp. This is an open-access article distributed under the terms of the Creative Commons Attribution License 3.0 Unported, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Figure 1. SCORE data for bar 3 of Beethoven Op. 81a.

  Figure 1 illustrates music typeset in the SCORE editor along with data describing the third measure. Each line of numbers represents a particular graphical element, such as the circled first note of the third measure, which is represented on the second line in the data excerpt.

  The first four numbers on each line have a consistent meaning across all notational items:

  P1: Item type (note, rest, clef, barline, etc.).
  P2: Item staff number on the page.
  P3: Item horizontal position on the page.
  P4: Item vertical position on the staff.
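  Because every SCORE item is just a list of floating-point numbers with positional meanings, reading a line of data is straightforward. The sketch below is our own illustration (not part of SCORE or scorelib); the type codes are the ones discussed in the next paragraph, and the example values are those of the note examined in Figure 2:

    P1_TYPES = {1: "note", 5: "slur", 6: "beam", 14: "barline"}  # partial list

    def parse_score_item(line, n_params=20):
        """Parse one line of SCORE data into a fixed-length parameter list.

        Trailing parameters not written in the file are implied to be
        zero (n_params is an illustrative upper bound)."""
        p = [float(tok) for tok in line.split()]
        p += [0.0] * (n_params - len(p))
        return {
            "type": P1_TYPES.get(int(p[0]), "other"),
            "staff": int(p[1]),  # P2: staff number on the page
            "hpos": p[2],        # P3: horizontal position (0-200 units)
            "vpos": p[3],        # P4: vertical position on the staff
            "params": p,         # note the off-by-one: P1 is p[0]
        }

    # The note from Figure 2: staff 2, P5=10 (stem up, no accidental),
    # P7=0.5 (eighth note), P8=2.5 (stem length).
    item = parse_score_item("1 2 80.335 5 10 0 0.5 2.5")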
  Parameter one (P1) indicates the element type—in this example 1=note, 5=slur, 6=beam, and 14=barline. The second number is the staff onto which the element is placed, with P2=1 for the bottom staff and P2=2 for the next higher staff on the page. The third parameter is the horizontal position of the item on the page, typically a number from 0.0, representing the page's left margin, to 200.0 for the right margin. In Figure 1, items are sorted by horizontal position (P3) from left to right on the page; however, SCORE items may occur in any order, which typically indicates drawing sequence (z-order) when printing the items. P4 indicates the diatonic vertical position on a staff, with positions 3, 5, 7, 9, and 11 being the lines of a five-line staff from bottom to top.

  These first four numbers on a line give each item an explicit location on the page. The horizontal position is an absolute value dependent on the printing area, while the vertical axis is a hierarchical system based on the staff to which an item belongs: an item's vertical position is an offset from the staff's position on the page, and the staff may have an additional offset from its default position on the page. 1

Figure 2. Parameter values and meanings for a note.

  The meaning of parameters greater than P4 depends on the type of graphical element being described. Objects with left and right endpoints (beams, slurs, lines) will use P5 as the right vertical position and P6 as the right horizontal position. Figure 2 illustrates some of the higher parameter positions for a note. In this example, P5 describes the stem and accidental display type for the note, with "10" in this case meaning the note has a stem pointing upwards and that there are no accidentals displayed in front of the note. P6 describes the notehead shape, with 0 meaning the default shape of a solid black notehead. P7 indicates the musical duration of the note in terms of quarter notes, such as 0.5 representing an eighth note. P8 indicates the length of the stem with respect to the default height of an octave. All other unspecified parameters after the last number in the list are implied to be zero. This means either a literal 0, or it may mean to use the default value for that parameter. For this example the implied 0 of P9 indicates that the note has no flags on the stem, nor are there any augmentation dots following the notehead.

  Multiple attributes may be packed into a single parameter value, such as P5 and P9 in the above example. This parameter compression was due to memory limitations in computers during the 1970's and 1980's. All values in SCORE data files use 4-byte floating-point numbers. When a parameter can be represented by ten or fewer states, it is typically stored as a decimal digit within these numbers. For example, stem directions of notes are given in the 10's digit of P5, while the accidental type is given in the 1's digit. In addition, the 100's digit of P5 indicates whether parentheses are to be placed around the accidental, and the fractional portion of P5 indicates a horizontal offset for the accidental in front of the note. The Windows version of the SCORE editor retains this attribute packing system, primarily for backwards compatibility with the MS-DOS version of the program, since many professional users of SCORE still use the MS-DOS version. This minimal data footprint could also be taken advantage of in low-memory situations such as mobile devices or slow network connections.

  SCORE parameters have an interpreted meaning based on the item type and parameter number. With the advent of greater and cheaper memory in computers, the general trend, as seen in XML data formats, is to provide a key description along with the parameter data. Note that this is a trivial difference between data formats in terms of functionality, but it is more convenient for readability and error checking. Below is a hypothetical translation of the SCORE note element discussed in Figure 2 that has been converted into an XML-style element, providing explicit key/value pairs for parameters rather than the fixed-position compressed parameter sequence:

    <note>
       <staff>2</staff>
       <hpos>80.335</hpos>
       <vpos>5</vpos>
       <stem>up</stem>
       <accidental>none</accidental>
       <shape>solid</shape>
       <duration>0.5</duration>
       <stem-length>2.5</stem-length>
       <flags>0</flags>
       <aug-dots>0</aug-dots>
    </note>

  1 For a detailed description of the layout axes, see pp. 7–10 of http://scorelib.sapp.org/doc/coordinates/StaffPositions.pdf
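  Unpacking these packed digits is mechanical; the following sketch (again our own illustration, not scorelib's API) recovers the P5 attributes of a note exactly as described above:

    def unpack_note_p5(p5):
        """Split a note's P5 value into its packed attributes: 1's digit =
        accidental type, 10's digit = stem direction, 100's digit =
        parenthesized accidental, fractional part = accidental offset."""
        whole = int(p5)
        return {
            "accidental": whole % 10,
            "stem": (whole // 10) % 10,
            "paren_accidental": (whole // 100) % 10,
            "accidental_offset": round(p5 - whole, 6),
        }

    # P5 = 10 for the note in Figure 2: stem digit 1 (an upward stem in
    # this example), accidental digit 0 (no accidental displayed).
    print(unpack_note_p5(10.0))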
  A translation of these note parameters into MusicXML syntax might look like this:

    <note default-x="13">
       <pitch>
          <step>G</step>
          <octave>4</octave>
       </pitch>
       <duration>4</duration>
       <voice>1</voice>
       <type>eighth</type>
       <stem default-y="25">up</stem>
       <beam number="1">begin</beam>
    </note>

…manipulate the data based on either of these descriptions of the music. For example, data entry on each staff can be done independently, in which case the notes on each staff are not aligned vertically. The SCORE program's LJ command aligns the notes across system staves based on the P7 durations, and this will cause the P3 values of notes to match their rhythmic partners on other staves.
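  The idea behind LJ-style alignment can be illustrated with a toy sketch: compute each note's rhythmic onset from the cumulative P7 durations, then give notes that share an onset the same P3 position. This is our own simplification; SCORE's actual justification algorithm spaces the music far more carefully:

    def lj_align(staves):
        """staves: dict staff-number -> list of notes, each {"P7": dur}.
        Assigns an 'onset' and a shared 'P3' to every note, spread
        between the page margins (0-200 horizontal units).
        (Real data would need a tolerance when matching onsets.)"""
        onsets = set()
        for notes in staves.values():
            t = 0.0
            for note in notes:
                note["onset"] = t       # cumulative P7 duration so far
                onsets.add(t)
                t += note["P7"]
        ordered = sorted(onsets)
        step = 180.0 / max(1, len(ordered) - 1)
        position = {onset: 10.0 + i * step for i, onset in enumerate(ordered)}
        for notes in staves.values():
            for note in notes:
                note["P3"] = position[note["onset"]]
        return staves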
…computer program in order to generate a correct interpretation of the notation. As an example of the inter-dependency of these two steps, the OMR program SharpEye 2 is quite sensitive to visual breaks in note stems. Finding stemless noteheads often leads it to identify the noteheads as double whole rests, which have roughly the same shape as a stemless black notehead. This is clearly a nonsensical interpretation when it occurs in meters such as 4/4, or against notes on other staves that do not have the same duration as a double whole note. In such cases, where the interpretation stage yields strange results, the identification stage for a graphical element should be reconsidered.

  SCORE's data format can be considered a perfect representation of the first stage in OMR processing, where all graphical elements have been correctly identified. Converting between a basic OMR representation of graphical elements and SCORE data is relatively easy. For example, Christian Fremerey of the University of Bonn/ViFaMusik was able to write a Java program, called mro2score, within a few days that converts SharpEye's graphical representation format into SCORE data. 3

  The mro2score program essentially transcodes the identification-stage musical data from OMR identification and adds minimal markup to convert it into SCORE data. In order to convert such symbols into musically meaningful syntaxes, more work is necessary. Most OMR programs have built-in editors used to assist the correction of graphic symbol identification as well as their final interpretation. Such editors function in a manner similar to the SCORE editor, which can display graphical elements containing syntactic errors such as missing notes or incorrect rhythms. Most graphical notation editors, such as MuseScore, Sibelius or Finale, require syntactically correct data, so they are not as well suited to interactive correction of OMR data.

  In order to convert from SCORE data into more symbolic music formats, an open-source parsing library and related programs called scorelib has been developed by the author. 4 This library provides automatic analysis of the relations between notational elements in the data, linking music across pages, grouping music into systems and parts, linking notes to slurs and beams, as well as interpreting the pitches of notes. This library is designed to handle the second stage in OMR conversions of scanned music into symbolically manipulable musical data. Conversion from SCORE, and by extension low-level OMR recognition data, into other more symbolic data formats becomes much simpler once these relationships between graphical items have been analyzed using scorelib. Currently the scorelib codebase can convert SCORE data into MIDI, Humdrum, Dox, MuseData, MusicXML and MEI data formats. 5

  The following sub-sections describe the basic order of analyzing SCORE data in order to extract the higher-level musical information needed for conversion into other musical data formats.

2.1 Staves to Systems

SCORE data does not include any explicit grouping of staves into musical systems (a set of staves representing different parts playing simultaneously). So when extracting symbolic information from SCORE data, the first step is to group the staves on a page into systems. Errors are unlikely to occur in this grouping process, since staves linked together by barlines are the standard graphical representation for systems. In orchestral scores, parts may temporarily drop out on systems where they do not have notes. In SCORE data, staves are given a part number so that printed parts can be generated from such scores by inserting additional rests for systems on which the part is not present.

2.2 Systems to Movement

Once musical systems have been identified on a page in SCORE (or with any raw OMR graphical elements), the identification of the sequence of systems across multiple pages forming a full movement is necessary in order to interpret items such as slurs and ties, which may be broken graphically by system line breaks. If a set of pages describes a single work, this process is generally as trivial as the staves-to-systems identification; however, automatic identification of new movements/works will be dependent on the graphical style of the music layout. Typically, indenting the first system indicates a new movement/work, but this assumption is not always true. When interpreting SCORE or OMR data, manual intervention may sometimes be needed to handle non-standard or unanticipated cases in movement segmentation.

2.3 Pitch Identification

Pitch identification requires extensive processing of the data. The previous two steps—linking staves into systems, and systems across pages into movements—must first be done before identifying pitch. The data must then be read temporally, system by system, throughout the movement, keeping track of the current key and resetting the spelling of pitches at each barline for each part/staff.

  2 http://www.visiv.co.uk
  3 http://www.ccarh.org/courses/253/lab/mro2score
  4 http://scorelib.sapp.org
  5 See http://scorelib.sapp.org/program for a list of available conversion and processing programs.
                                                                128
pitch sequence (g, g, d-flat, c, c, d-natural) for the top            the layer interpretation of the music from Figure 1. Since
staff of music in measure three of Figure 1.                          there are no more than two layers on any staff, automatic
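The barline-reset logic can be sketched compactly. The following is an illustration only, not scorelib’s code, and the token format is invented, but it captures the spelling rule described above: the key signature supplies a default accidental for each letter, written accidentals override it for the remainder of the measure, and the overrides are discarded at every barline.

    # Illustration only (not scorelib's code; the token format is
    # invented). A key signature sets the default accidental for each
    # letter; a written accidental overrides it for the rest of the
    # measure; overrides are discarded at every barline.
    FLATS, SHARPS = "BEADGCF", "FCGDAEB"   # order of addition

    def key_accidentals(count):            # count < 0 means flats
        if count < 0:
            return {c: "-" for c in FLATS[:-count]}
        return {c: "#" for c in SHARPS[:count]}

    def spell(measures, key_count):
        base = key_accidentals(key_count)
        result = []
        for measure in measures:           # measure: (letter, written)
            active = dict(base)            # reset at the barline
            out = []
            for letter, written in measure:
                if written is not None:    # explicit accidental
                    active[letter] = "" if written == "n" else written
                out.append(letter.lower() + active.get(letter, ""))
            result.append(out)
        return result

    # Measure three of Figure 1, top staff, three flats (cf. Figure 4):
    print(spell([[("G", None), ("G", None), ("D", "-"),
                  ("C", None), ("C", None), ("D", "n")]], -3))
    # -> [['g', 'g', 'd-', 'c', 'c', 'd']]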
Figure 5 shows the layer interpretation of the music from Figure 1. Since there are no more than two layers on any staff, automatic recognition of the layers is unambiguous. The first layer (as defined in most graphical music editors) is the highest-pitched music in the measure, with stems pointing upwards whenever a second layer lies below it. In Figure 5, the second layers in measures 5 and 6 are highlighted in red (or gray in black-and-white prints). The circled rest on the bottom staff of measure 4 presents an interpretational ambiguity: either the bottom layer can be considered to drop out at the rest, or the rest can be interpreted as shared between the two layers on the bottom staff. When extracting orchestral parts in such situations, both parts would share the rest, and both extracted parts would display it.
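The layer-separation rule, including the shared-rest interpretation, can likewise be sketched in a few lines (again illustrative only, not scorelib’s code):

    # Illustration only. When a staff carries two layers, stem
    # direction separates them; a stemless rest can be treated as
    # shared, so both extracted parts print it.
    def split_layers(items):               # items: (token, stem)
        up = any(s == "up" for _, s in items)
        down = any(s == "down" for _, s in items)
        layer1, layer2 = [], []
        for token, stem in items:
            if not (up and down) or stem == "up":
                layer1.append(token)       # single layer, or first layer
            elif stem == "down":
                layer2.append(token)       # second layer
            else:                          # shared rest (the circled case)
                layer1.append(token)
                layer2.append(token)
        return layer1, layer2

    # Bottom staff of measure 4: both layers share the eighth rest.
    print(split_layers([("8G", "up"), ("4.GG", "down"), ("8An", "up"),
                        ("8Bn", "up"), ("8r", None)]))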
3. DATA CONVERSION FROM SCORE

SCORE uses a two-dimensional description of musical notation, and its data can be serialized in any order, since items’ positions on the page are independent of each other. Nearly all other music-notation formats impose a sequential structure onto their data, typically chopping up the score into parts, measures, and then layers, which form monophonic chunks that are serialized in different ways. This section presents some of the conversions available with sample programs accompanying the scorelib library.

Figure 7 illustrates three serialization methods within measures that are commonly found in music-notation data formats. In Humdrum data, notes are always serialized by note-attack times – in other words, all notes from each part/layer played at the same time are found adjacent to each other in the data. This configuration is also true of Standard MIDI Files in type-0 arrangement, where all notes are presented in strict note-attack order. Most other data formats organize music into horizontal/monophonic sequences by measure rather than by vertical/harmonic slices. MEI chops up a score into a sequence of measures/parts/staves, and finally the staves are segmented into a parallel sequence of monophonic layers. MuseData and MusicXML use the same serialization technique within a measure, but their layer segmentations are not as hierarchical as MEI’s. MusicXML has two ways of serializing measures in a score (partwise and timewise), but these methods do not affect serialization within a measure.

Notation formats also differ in their layout-description capabilities. The complexity of the notation will determine the necessity of preserving layout information when translating to other file formats. Simple music can be re-typeset automatically without problems; however, complex music is difficult to typeset automatically with suitable readability, and human intervention is usually required to maximize readability in complex notational situations. Many music-notation editing programs focus on ease of manipulation of the musical layout and try to minimize the need for manual control. Likewise, they internally hide the layout information that would be necessary to convert into layout-explicit representations such as SCORE data.

Automatic layout will always fail at some point, since the purpose of music notation is to convey performance data to a musician in the most efficient means possible. Typesetting involves many rules and standards, but frequently the rules need to be broken, or conflicting rules override each other. Any confusion in the layout decreases the effectiveness of the notation, and a professional typesetter can deal with such conflicts on a cognitive level much higher than a computer program. Being able to preserve the precise musical layout of SCORE (or OMR) data is therefore very useful, since it retains human-made layout decisions.
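The serialization contrast described above can be made concrete with a small sketch. The note records and sort keys below are invented rather than any format’s actual syntax; the point is simply that the two families differ only in the ordering applied to the same notes:

    # Illustration only: invented note records, not any format's
    # actual syntax.
    notes = [
        # (part, measure, layer, onset_in_quarters, token)
        (1, 1, 1, 0.0, "e-"), (1, 1, 1, 1.0, "B-"),
        (2, 1, 1, 0.0, "2r"),
        (1, 2, 1, 0.0, "G"),  (1, 2, 2, 0.0, "e-"),
        (2, 2, 1, 0.0, "CC"),
    ]

    # Humdrum / type-0 MIDI style: strict note-attack order, so notes
    # sounding together are adjacent regardless of part or layer.
    by_attack = sorted(notes, key=lambda n: (n[1], n[3], n[0], n[2]))

    # MEI / MuseData / MusicXML style: measure, then part/staff, then
    # layer, so each layer forms a monophonic run within the measure.
    by_measure = sorted(notes, key=lambda n: (n[1], n[0], n[2], n[3]))

    print([t for *_, t in by_attack])
    print([t for *_, t in by_measure])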
be displayed as red vertical lines within the editor, as shown in the screen capture at the bottom of Figure 8. These gridlines are calculated directly from the horizontal placement (P3) of notes when converting from SCORE data. Within Dox data, the absolute horizontal positions are converted into incremental distances from the previous composite-rhythm time in the measure.

Unlike SCORE data, the Dox format separates layout information from symbolic musical elements. Figure 9 shows some sample Dox data illustrating this property. At the start of the data for each system, a header gives layout information. The bars directive controls the absolute positions of the measures within the system, and each grid directive controls the spacing between composite-rhythm positions within each measure. For example, “147x13” at the start of the grid for the first measure means that the first beat is 147 spatial units from the start of the measure (relatively wide, to allow for the system clef and key signature to be inserted); the next position in the composite-rhythm sequence is a sixteenth note later and is placed 13 units after the notes on the first beat.

The Dox editor manipulates note spacing by adjusting these grid points, so notes across multiple staves in a system sounding at the same time are always vertically aligned. Vertical positioning of staves as well as staff sizes are also stored in Dox data, so page layout can be preserved when converting from SCORE data.

This organization allows the musical content to be read directly from the representation, more so than any other symbolic digital representation of musical notation that encodes parts serially rather than in the parallel fashion of Humdrum.

The following text lists a conversion from the SCORE data of Figure 1 into Humdrum syntax. Each staff is represented by a column of data (a spine), with staff layers causing splits of the spines into sub-columns. Each line of data represents notes sounding at the same time, so the rows represent the composite rhythm of all parts, which is similar to the rhythm sequence of the grid directives in Dox.

    **kern                **kern
    *staff2               *staff1
    *clefF4               *clefG2
    *k[b-e-a-]            *k[b-e-a-]
    *M2/4                 *M2/4
    =1-                   =1-
    2r                    4e-/ 4g/
    .                     4B-/ 4f/
    =2                    =2
    4.CC/ 4.C/            4.G/ 4.e-/
    8C/ 8E-/              (16.e-/LL
    .                     32a-/JJk)
    =3                    =3
    8BB-/ 8En/L           8g/L
    8BB-/ 8E/             (16.g/L
    .                     32dd-/JJk)
    8AAn/ 8F/             8cc/L
    8AA-/ 8F#/J           (16.cc/L
    .                     32ddn/JJk)
    =4                    =4
    *^                    *^
    (8G/L     4.GG\       (8.ee-/L      8e-\L
    8An/      .           .             8e-\ 8f#\
    .         .           16cc/k        .
    8Bn/J)    .           8bn/J)        8d\ 8g\J
    8r        8r          (16gg\LL      8r
    .         .           16eee-\JJ)    .
    *clefG2   *clefG2     *             *
    *v        *v          *             *
    =5        =5          =5
    *^        *           *
    8g/L      4.G\        8.eee-/L      8ee-\L
    8an/      .           .             8ee-\ 8ff#\
    .         .           16ccc/Jk      .
    8bn/      .           8bbn/L        8dd\ 8gg\
    8b-/J     8g\         8bb-/J        8dd\ 8gg\J
    *v        *v          *             *
    *                     *v            *v
    *-                    *-
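Such a listing can be machine-read with little code. As a rough illustration (this is neither scorelib nor the Humdrum toolkits, and it ignores the *^ and *v spine manipulations that a real parser must track), the composite rhythm can be tallied by counting the data rows that introduce at least one non-null token:

    # Rough sketch of machine-reading a kern excerpt. Rows starting
    # with "*" are interpretations, "=" barlines, "!" comments;
    # "." means no new event in that spine.
    def composite_onsets(kern_text):
        count = 0
        for line in kern_text.strip().splitlines():
            if not line or line[0] in "*=!":
                continue
            tokens = line.split("\t")      # spines are tab-separated
            if any(tok != "." for tok in tokens):
                count += 1                 # at least one new attack
        return count

    measure_one = "=1-\t=1-\n2r\t4e-/ 4g/\n.\t4B-/ 4f/\n=2\t=2"
    print(composite_onsets(measure_one))   # -> 2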
Layout parameters are interleaved within the data, typically being given as element attributes. Figures 10 and 11 illustrate conversions from SCORE into MusicXML for a work by Dufay, generated by the score2musicxml converter. These two figures highlight the page-layout information that can be preserved when translating between SCORE and MusicXML. Both figures have the same system-break locations, staff scalings and system margins. While MusicXML 3.0 has the capability to specify the horizontal layout of notes and measures, this information is currently stripped out of the data when importing into the most recent version of Finale (2014).
In MusicXML, notes after the first in a chord are marked with a <chord/> child element.
                                                                          Figure 12. A chord in SCORE format with translations into MEI and
                                                                          MusicXML below.
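The layout details highlighted in Figures 10 and 11 can also be recovered programmatically. The following minimal sketch assumes a partwise MusicXML 3.0 file with a hypothetical name (it is not the score2musicxml code): system breaks appear as print elements with new-system="yes", and the margins live in their system-layout child.

    # Sketch: list the system breaks and margins preserved in a
    # MusicXML 3.0 file. The file name is hypothetical.
    import xml.etree.ElementTree as ET

    root = ET.parse("dufay.musicxml").getroot()   # a partwise file
    for part in root.iter("part"):
        for measure in part.iter("measure"):
            pr = measure.find("print")
            if pr is not None and pr.get("new-system") == "yes":
                margins = pr.find("system-layout/system-margins")
                left = (margins.findtext("left-margin", "?")
                        if margins is not None else "?")
                print("new system at measure", measure.get("number"),
                      "| left margin:", left)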
layout for interactive editing.6 MusicXML is structurally based on the symbolic form of MuseData. The compiled layout-specific format is analogous to SCORE data. A useful property of the MuseData printing system is access both to the high-level symbolic representation and to the low-level graphical representation.

3.6 SCORE and SVG

Due to SCORE data’s graphical nature, converting it into images is less complex than generating images from purely symbolic representations (outside of the intended software for a representation, of course). While each graphical element in SCORE can be placed independently at a pre-determined position in an image, software processing symbolic formats must first calculate a graphical layout, and unlike MuseData this layout representation is typically inaccessible as an independent data format. While the SCORE software does not export SVG images natively, minimal processing of its EPS output can produce SVG images.7 Analytic overlays on the notation image can be aligned to the image using the layout information from the original SCORE data.

Since SCORE data is compact, it can be stored within an XML file. For example, the complete SCORE data for the music of Figure 1 can be found in an SVG image of the incipit used on the Wikipedia page for Beethoven’s 26th piano sonata.8 At the bottom of the SVG image’s source code, the SCORE data used to create the SVG image is embedded within a processing instruction using this syntax:
    <?SCORE version="4"
        SCORE data placed here
    ?>
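Recovering the embedded data is then straightforward. The sketch below is illustrative, with a hypothetical file name; since xml.etree discards processing instructions by default, a regular expression is used instead:

    # Sketch: extract the SCORE data embedded in such an SVG file.
    import re

    with open("incipit.svg", encoding="utf-8") as f:
        svg = f.read()

    m = re.search(r'<\?SCORE version="4"(.*?)\?>', svg, re.DOTALL)
    if m:
        print(m.group(1).strip()[:200])   # first part of the data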
4. CONCLUSIONS

SCORE is an important historical data format for computer-based music typesetting. Understanding its graphical representation system is particularly useful for projects in OMR, where interpreted graphical symbols must be organized by a process similar to converting from SCORE into other data formats. In addition, the SCORE representation system should be studied by projects writing automatic music layout for purely symbolic data. SCORE is primarily used by professional typesetters due to its high-quality output and the degree of control afforded to the typesetter. Using the scorelib software allows SCORE data to be converted more easily into other musical formats, usually with minimal manual intervention and with the original layout exactly preserved.

   6 The batch-processing version of the MuseData printing system (http://musedata.ccarh.org) can be used to generate both PostScript output and the intermediate layout representation called Music Page Files (MPG).
   7 Using the open-source converter https://github.com/thwe/seps2svg
   8 http://en.wikipedia.org/wiki/Piano_Sonata_No._26_(Beethoven)
   9 https://github.com/craigsapp/scorelib/tree/master/data
   10 http://imslp.org/wiki/User:Craig
                           CODE SCORES IN LIVE CODING PRACTICE
                                                              Thor Magnusson
                                                             Department of Music
                                                         University of Sussex, Brighton
                                                       [email protected]
those were often that they became quite verbose, and the ‘naturalness’ of their syntax was never so clear. Consequently, in a more familiar object-oriented dot-syntax, the above might look like:

    w = Sine("foo", [\freq, 440]);
    w.addEnvelope(\adsr);
    p = Pattern("ping");
    p.seq(\foo, 0.25);

In both cases we have created a synthesizer and a pattern generator that plays the synth. The latter notation is less prone to mistakes, and for the trained eye the syntax actually becomes symbolic through the use of dots, camelCase, equals signs, syntax coloring, and brackets with arguments that are formatted differently according to type. This is called ‘secondary notation’ in computer-science parlance, and addresses the cognitive scaffolding offered by techniques like colorization or indentation [5].

3. LIVE CODING AS NOTATION

Live coding is a real-time performance act and therefore requires languages that are relatively simple, forgiving in terms of syntax, and high level. Certain systems allow for both high- and low-level approaches to music making, for example SuperCollider, which enables the live coder to design instruments (or synths) whilst playing them at another level of instructions (using patterns). Perhaps the live coding language with the most vertical approach is Extempore [6], a live coding environment where the programming language Scheme is used at the high level to perform and compose music, but another language – type-sensitive and low-level, yet keeping the functional programming principles of Scheme – can be used for low-level, real-time-compiled instructions (using the LLVM compiler). In Extempore, an oscillator, whether in use or not, can be redesigned and compiled into bytecode in real time, hot-swapping the code in place.

However, live performance is stressful, and most live coders come up with their own systems for high-level control. The goals are typically a fast composition cycle, understandability, and novel interaction, but most importantly to design a system that suits the live coder’s way of thinking. Below is an introduction to four systems that each explore a particular way of musical thinking, language design, and novel visual representation.

Figure 1: A screen shot of Tidal. We see the score written in the quotation marks, with functions applied.

Alex McLean’s Tidal [7] is a high-level mini-language built on top of Haskell. It offers the user a limited set of functionality: a system of constraints that presents a large space for exploration within the constraints presented [8]. The system focuses on representing musical pattern. The string score is of variable length, where items are events, but these items can take the form of multi-dimensional arrays, representing sub-patterns. This particular design decision offers a fruitful logic of polyrhythmic and polymetric temporal exploration. The system explicitly affords this type of musical thinking, which consequently limits other types of musical expression. The designers of the live coding languages discussed in this section are not trying to create a universal solution to musical expression, but rather to define limited sets of methods that explore certain musical themes and constraints.

Dave Griffiths’ Scheme Bricks is a graphical coding system of a functional paradigm and, like Tidal, it offers a way of creating recursive patterns. Inspired by the MIT Scratch [9] programming system, a graphical visualization is built on top of the functional Scheme programming language. The user can move blocks around and redefine programs through visual and textual interactions that are clear to the audience. The colored code blocks are highlighted when the particular location of the code runs, giving an additional representational aspect to the code.

Figure 2: Scheme Bricks. In simple terms, what we see is the bracket syntax of Scheme represented as blocks.

Scheme Bricks is fruitful for live musical performance, as patterns can be quickly built up, rearranged, muted, paused, etc. The modularity of the system makes it suitable for performances where themes are reintroduced (a muted block can be plugged into the running graph).

This author created ixi lang in order to explore code as musical notation [10]. The system is a high-level language built on top of SuperCollider and has access to all the functionality of its host. By presenting a coherent set of bespoke ‘ixi lang’ instructions in the form of a notational interface, the system can be used by novices and
experienced SuperCollider users alike. The system removes many of SuperCollider’s complex requirements for correct syntax, whilst using its synth definitions and patterns; the original contribution of ixi lang is that it creates a mini-language for quick prototyping and thinking.

Figure 3: ixi lang. Agents are given scores that can be manipulated and changed from other code.

In ixi lang the user creates agents that are assigned percussive, melodic, or concrete scores. The agents can be controlled from other locations in the code, and during that process the textual document is automatically rewritten to reflect what is happening to the agents. This makes it possible for the coder and the audience to follow how the code is changing itself and the resulting music. As code can be rewritten by the system, it also offers the possibility of storing the state of the code at any given time in the performance. This is done by writing a snapshot with a name: the snapshot can then be recalled at any time, whereupon newly run code is muted (and changes color), and agents whose scores have changed return to the state they had when the snapshot was taken.

Figure 4: Gibber. Here textual code can change size, color or font responding to the music. All user-defined.

A recent live coding environment by Charlie Roberts called Gibber [11] takes this secondary notation and visual representation of music further than ixi lang: here we can see elements in the code highlighted when they are played; the text flashes, colors change, and font sizes can be changed according to what is happening in the music. Gibber allows any textual element to be mapped to any element in the music. The code becomes a visualization of its own functionality: equally a prescription and a description of the musical processes.

Gibber is created with the recent Web Audio API, a JavaScript system for browser-based musical composition. As such it offers diverse ways of sharing code and collaborating over networks, in real time or not.

All of the systems above use visual elements as both primary and secondary notation for musical control. The notation is prescriptive – aimed at instructing computers – although elements of secondary notation can represent information that could be said to serve a descriptive purpose [12]. The four systems have in common a constrained set of functions aimed at exploring particular musical ideas. None of them – bar Gibber, perhaps – aims at being a general audio programming system, as the goals are those of live coding: real-time composition, quick expression, and audience understanding.

4. THE THRENOSCOPE

In the recent development of the Threnoscope system, the author has explored representational notation in live coding. This pertains to the visualization of sound, where audible musical parameters are represented graphically. The system is designed to explore microtonality, tunings, and scales, and in particular how those can be represented in visual scores aimed at projection for the audience.

The Threnoscope departs from linear, pattern-based thinking in music and tries to engender the conditions of musical stasis through a representation of sound in a circular interface where space (both physical space and pitch space) is emphasized, possibly becoming more important than concerns of time.

The system is object-oriented: the sound object – the ‘drone’ – gets a graphical representation of its state. This continuous sound can be interacted with through code, the graphical user interface, MIDI controllers, and OSC commands, and it changes visually depending upon which parameters are being controlled. The user can also create ‘machines’ that improvise over a period of time on specific sets of notes, as defined by the performer. These machines can be live coded, such that their behavior changes during their execution. Unlike the code score, discussed below, the machines are algorithmic: they are not intended to be fully defined, but rather to serve as an unpredictable accompaniment to the live coder.
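This division of labor can be summarized in a small object model. The sketch below is illustrative only – the names are invented, and the real Threnoscope is written in SuperCollider – but it shows the drone as long-lived, mutable state and the machine as a replaceable strategy acting on it:

    # Illustrative object model only; all names are invented.
    import random

    class Drone:
        def __init__(self, freq, amp=0.5, pos=0.0):
            self.state = {"freq": freq, "amp": amp, "pos": pos}

        def set(self, **params):
            # called from code, the GUI, MIDI, or OSC alike;
            # the graphical view would redraw from self.state
            self.state.update(params)

    class Machine:
        def __init__(self, drone, notes):
            self.drone, self.notes = drone, notes
            self.choose = random.choice   # replaceable (live-codable)

        def step(self):
            self.drone.set(freq=self.choose(self.notes))

    d = Drone(freq=220.0)
    m = Machine(d, notes=[220.0, 247.5, 264.0, 297.0])
    for _ in range(4):
        m.step()                          # improvises over the note set
        print(d.state["freq"])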
As the envelope of the drone is an ASR (Attack, Sustain, Release) envelope, a note duration can range from a few milliseconds to an infinite length. Each of the speaker lines could be seen as a static playhead, where notes either cross it during their movement or linger above it with continuous sound. A compositional aspect of the Threnoscope is to treat notes as continuous objects with states that can be changed (spatial location, pitch, timbre, amplitude, envelope, etc.) during their lifetimes.

The Threnoscope has been described before, both in terms of musical notation [13] and as a system for improvisation [14]. This paper further explores the notational aspects of the system and the design of the code score.
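The open-ended duration follows directly from the envelope shape. A minimal sketch of the ASR idea (illustrative, not the Threnoscope’s implementation): the sustain stage holds until a release time is supplied, so a note can last milliseconds or indefinitely.

    # Minimal ASR sketch: amplitude at time t (seconds).
    def asr_amplitude(t, attack, release_at=None, release=1.0):
        if t < attack:
            return t / attack                     # attack ramp
        if release_at is None or t < release_at:
            return 1.0                            # open-ended sustain
        return max(0.0, 1.0 - (t - release_at) / release)

    print(asr_amplitude(0.05, attack=0.1))        # 0.5, mid-attack
    print(asr_amplitude(600.0, attack=0.1))       # 1.0, still sustaining
    print(asr_amplitude(600.5, attack=0.1, release_at=600.0))  # 0.5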
manipulated in real-time, much like we are used to with MIDI sequencers or digital audio workstations.

Figure 7: A graphical visualization of the code score. When a vertically lined drone is clicked on, a page with its code appears above the circular interface.

Most timelines in music software run horizontally from left to right, but in the Threnoscope the score is vertical and runs from the top down. This is for various reasons: firstly, the screen space left available on most display resolutions, once the circular score has taken up the main space on the left, is a rectangle longer on the vertical axis; secondly, when a user clicks on the visual representation of a drone, its score pops up in the textual form of code, and this text runs from top to bottom. It would be difficult to design a system where code relates to events on a horizontal timeline.

6. PERFORMING WITH SCORES

The code score was implemented for two purposes. The first is to enable small designed temporal patterns to be started at any point in a performance: just as jazz improvisers often memorize certain musical phrases or licks, the code score enables the live coder to pre-compose musical phrases. The second is to provide a format for composers to write longer pieces for the system, both linear and generative.

The score duration can therefore range from a short single event to hours of activity; a score can be started and stopped at any point in a performance, and the performer can improvise on top of it. Scores can include other scores. As an example, a performer in the middle of a performance might choose to run a 3-second score that builds up a certain tonal structure. The following shows the code required to start a score.

    ~drones.playScore(\myScore, 1) // name of score & time scale
    ~drones.showScore(\myScore)    // visual display of the score

The first method simply plays the score without a graphical representation. This is very flexible, as multiple scores can be played simultaneously, or the same score started at different points in time. Scores can be stopped at will. Whilst the scores are typically played without any visual representation, it can be useful to observe the score graphically. The second method creates the above-mentioned graphical representation of the score shown in Figure 7. For a live performance this can be helpful, as it allows the performer to interact with the score during execution. The visual representation of the score can also assist in gaining an overview of a complex piece.

For this author, the code score has been a fruitful and interesting feature of the system. Using scores for digital systems aimed at improvisation becomes equivalent to how instrumentalists incorporate patterns into their motor memory. The use of code scores questions the much-broken unwritten ‘rule’ that a live coding performance should be coded from scratch. It enables the live coder to work at a higher level, to listen more attentively to the music (which, in this author’s experience, can be difficult when writing a complex algorithm), and generally to focus more on the compositional aspects of the performance.

7. CONCLUSION

This short paper has discussed domain-specific programming languages as notational systems. Live coding systems are defined as often being idiosyncratic and bespoke to their authors’ thought processes. The Threnoscope and its code score were presented as a solution to certain problems of performance and composition in live coding, namely of delegating activities to actors such as machines or code scores.

8. REFERENCES

[1] N. Collins, A. McLean, J. Rohrhuber, and A. Ward, “Live coding in laptop performance,” Organised Sound, vol. 8, no. 3, 2003, pp. 321-330.

[2] T. Magnusson, “Herding Cats: Observing Live Coding in the Wild,” Computer Music Journal, vol. 38, no. 1, 2014, pp. 8-16.

[3] T. Magnusson, “Algorithms as Scores: Coding Live Music,” Leonardo Music Journal, vol. 21, no. 1, 2011, pp. 19-23.

[4] C. Kiefer, “Interacting with text and music: exploring tangible augmentations to the live-coding interface,” in Proceedings of the International Conference on Live Interfaces, 2014.

[5] A. F. Blackwell and T. R. G. Green, “Notational systems – the Cognitive Dimensions of Notations framework,” in J. M. Carroll (ed.), HCI Models, Theories and Frameworks: Toward a Multidisciplinary Science, San Francisco: Morgan Kaufmann, 2003, pp. 103-134.
[6] A. Sorensen, B. Swift, and A. Riddell, “The Many Meanings of Live Coding,” Computer Music Journal, vol. 38, no. 1, 2014, pp. 65-76.

[7] A. McLean, “Making programming languages to dance to: Live coding with Tidal,” in Proceedings of the 2nd ACM SIGPLAN International Workshop on Functional Art, Music, Modelling and Design, 2014.

[8] T. Magnusson, “Designing constraints: composing and performing with digital musical systems,” Computer Music Journal, vol. 34, no. 4, 2010, pp. 62-73.

[9] M. Resnick, J. Maloney, A. Monroy-Hernández, N. Rusk, E. Eastmond, K. Brennan, A. Millner, E. Rosenbaum, J. Silver, B. Silverman, and Y. Kafai, “Scratch: Programming for All,” Communications of the ACM, vol. 52, no. 11, 2009.

[10] T. Magnusson, “ixi lang: a SuperCollider parasite for live coding,” in Proceedings of the International Computer Music Conference, 2011.

[11] C. Roberts and J. Kuchera-Morin, “Gibber: Live Coding Audio in the Browser,” in Proceedings of the International Computer Music Conference, 2012, pp. 64-69.

[12] A. F. Blackwell and T. R. G. Green, “Notational systems – the Cognitive Dimensions of Notations framework,” in J. M. Carroll (ed.), HCI Models, Theories and Frameworks: Toward a Multidisciplinary Science, San Francisco: Morgan Kaufmann, 2003, pp. 103-134.

[13] T. Magnusson, “Scoring with code: composing with algorithmic notation,” Organised Sound, vol. 19, no. 3, 2014, pp. 268-275.

[14] T. Magnusson, “Improvising with the threnoscope: integrating code, hardware, GUI, network, and graphic scores,” in Proceedings of the New Interfaces for Musical Expression Conference, 2014.
             THEMA: A MUSIC NOTATION SOFTWARE PACKAGE WITH
               INTEGRATED AND AUTOMATIC DATA COLLECTION
                                                         Peter McCulloch
                                                        New York University
                                                    [email protected]
2. PREVIOUS AND RELATED WORK

A few projects have used custom software to study compositional process. Otto Laske and Barry Truax used the Observer I program to observe children making music with a synthesizer [4]. Maude Hickey wrote a program to study creativity in children [5]. The QSketcher research project at IBM developed an environment for film music composition that automatically recorded and organized improvisational material and maintained a persistent view of its environment [10]. Recently, Chris Nash created reViSiT, a free “tracker”-style audio plugin that captured usage data [6]. He used the software to collect data from over 1,000 musicians; his is by far the largest sample size in this area, but his data set does not include the music that his subjects created.

In considering the use of quantitative data for studying compositional process, it is helpful to consider the existing infrastructure for storing data. Standardized open file formats have been a boon to music informatics researchers, as they allow musical data to be examined independently of the program that created it. This is helpful since designing music notation software is a time-intensive task, and standard file formats provide some degree of interoperability between programs. At present, however, there is no standard for how a music notation program should operate, which also means that standard music file formats such as MusicXML include very little information, if any, beyond what is already present in the score, such as how the composer is using the program or improvisational MIDI data. Additionally, our ability to understand how multiple versions of a score are related depends on effective comparison algorithms. Though comparison works for simple additions and deletions, it becomes less useful as the composer’s actions become more dispersed across time.

As an alternative to an ad hoc approach to data collection, which combines multiple data formats such as MusicXML and SMF and exists separately from the music software, there is a good argument to be made for an integrated data collection system that operates from within the software used to create the music. Such a system would have a greater understanding of how its constituent elements relate and would be able to integrate the information it collects into its operation. This capability could also be materially useful to composers, particularly in its ability to bridge the gap between improvisational development and transcription. There are certainly drawbacks to such a system, most notably that it ties the composer’s work to a particular piece of software. Nevertheless, it allows access to data which is not well served by existing methods; this data may provide insight beyond what is available in score and MIDI data.

3. THEMA

Thema is a music notation software environment, written using the Java Music Specification Language’s JScore package [11], that has been purpose-built for automatic data collection. On the surface, Thema is designed to operate in a manner similar to existing music notation software. Note entry occurs in step-entry mode, optionally with MIDI input, and most functions in the interface are available via keyboard shortcuts, including score playback. Selections may be made with the mouse, and a variety of commands such as transposition and clipboard operations are available. Thema has two different playback modes, cursor and selection: cursor mode plays the score from the current cursor location, while selection mode plays back only the selected material. Like ENP [12], Thema can make use of non-contiguous selections in the score, and this includes playback operations. Thema’s musical representation is limited relative to other software in the field such as Bach [13] and ENP: it cannot, for instance, represent nested tuplets, and it does not support breakpoint-function notation or proportional notation in editing.

Thema, like JScore, allows the user to operate on rhythm in relative units via doubling and halving operations, and this works both for selections and for setting the cursor’s duration. The latter aspect is useful in that it does not require the composer to remember specific key mappings for durations, though those are also available. In step-entry mode, the user may also select a rhythmic motive and then use its durations and ties as a source sequence for a new stream of notes, as shown in Figure 1. This simplifies the entry of repetitive figures such as dotted-eighth and sixteenth-note pairs. Additionally, the composer may advance to an arbitrary position within the sequence or reset to the beginning, and this allows for greater flexibility in applying the current sequence without requiring changes to its content; this can be useful, for example, in creating rhythmic ostinato figures in fluctuating meters.

Figure 1. Step entry with rhythmic sequence
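The source-sequence mechanism can be sketched as a small cyclic iterator. The names below are invented, not Thema’s API:

    # Sketch: durations from a selected motive applied cyclically to
    # new notes, with explicit advance and reset.
    class RhythmSource:
        def __init__(self, durations):
            self.durations = durations   # in quarter-note units
            self.index = 0

        def next(self):
            d = self.durations[self.index]
            self.index = (self.index + 1) % len(self.durations)
            return d

        def advance(self, n=1):          # jump within the sequence
            self.index = (self.index + n) % len(self.durations)

        def reset(self):                 # realign, e.g. at a meter change
            self.index = 0

    source = RhythmSource([0.75, 0.25])  # dotted eighth + sixteenth
    for pitch in ["C5", "D5", "E5", "F5"]:
        print(pitch, source.next())      # C5 0.75, D5 0.25, E5 0.75, ...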
3.1 Storage

By design, Thema focuses on automatically and transparently capturing low-level data at a fine time granularity. Where possible, data is stored as it is captured, and all entries are stored with time stamps at millisecond resolution. In order to enable frequent storage, Thema models the score, as well as the state of the program, as a collection of tables within a relational database, and only stores material that has changed between subsequent states. The database maintains previous versions of edited items so that all past states of an item are reachable. In order to easily identify multiple versions of an item across time, each item is tagged with a unique, automatically generated permanent identifier which is consistent across all versions of the item.

Storage in Thema is tightly integrated into the workings of the program, and it occurs as the result of actions by the composer, instead of at an arbitrary time interval. This makes it easier to discern the composer’s actions in the data stream, since any entry in the database is present as the direct result of the composer’s actions; it also prevents multiple actions from being condensed into a single entry, as might occur when saving at a fixed time interval. A command identifies elements which may have been intentionally changed, and the program proceeds to discern whether those elements were actually changed, as well as any changes to surrounding elements which may have occurred as a result. For example, changing the duration of the first note in the measure has the effect of changing the starting times of subsequent notes within the measure.

3.2 Data Representation

Thema represents the structure of the score in a straightforward, hierarchical fashion: a score contains measures, measures contain staves, staves contain tracks, and tracks contain notes. Following normalized database practices, objects within that hierarchy only reference the class of objects immediately above them in the hierarchy, with the exception that all objects maintain a reference to their containing score.2 For example, when storing a note in the database, a reference is stored to its track’s identifier, but not to its containing staff or measure. This insulates lower-level objects from being affected by changes in higher-level objects, such as the insertion of a measure earlier in the piece.

With a score changing over time, the amount of data that could be stored is potentially large. It would be inefficient to store the entire document for a small change, so Thema stores only objects that have changed. On load, it reassembles the score from the desired version. This is different from a diff-based approach because the state of the entire score across time is visible to search functions without the need for derivation. Accessing earlier versions of a score is as efficient as loading the current score, and it is simple to make comparisons across versions. When undoing commands or reverting to a previous state of the score, the state of the score at the desired moment is loaded from the database and then made current. A detailed description of this mechanism is provided in [14, pp. 85-109]; one interesting feature of this design is that though it appears to operate like a conventional undo-redo stack, all past states remain accessible.

3.3 Processes

Processes are the primary unit of work within Thema. Any action that the composer initiates within the program creates a timestamped entry in the process table, and each entry represents a state of the score or the environment, with the exception of MIDI data, which is stored independently of process information. This distinction is made because the volume of MIDI data is considerable, and it may arrive in parallel with other actions. MIDI data is recorded with a time stamp, and may be combined with the process-table data via an SQL join.

Thema separates processes into two categories: score and environment. Score processes alter the content of the score, whereas environment actions, such as adjusting the viewable area, playing back the score, and making drag selections with the mouse, do not. With score processes, the program also records whether or not a command had any effect, e.g., the user attempting to transpose a rest or make a deletion when the cursor is in a blank measure. This allows the program to skip over commands which had no impact when undoing, and also makes it possible to remove empty operations from consideration in analysis.

3.4 Separation of Identity from State

Whether using qualitative or quantitative data, working with multiple versions of a single score can often lead to reference problems. The score is not static over time in these situations, but most of our musical terminology for labeling material depends on it being so. For example, a label such as “the first A5 in measure 7” is tied inextricably to its current state. If the note is transformed, or measures are added in front of it, it is no longer an accurate descriptor. Adding timing information makes it easier to find the particular material as it existed at that moment, but it does not provide a sense of its identity across time.

   2 A database can contain multiple scores, and by explicitly filtering based on the score, queries run an order of magnitude faster.
                                                                           142
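A minimal sketch of this storage scheme, using Python's sqlite3; the table and column names here are illustrative assumptions, not Thema's actual schema:

    import sqlite3

    conn = sqlite3.connect("thema.db")
    conn.executescript("""
    CREATE TABLE IF NOT EXISTS notes (
        id       INTEGER,            -- permanent identifier (see Section 3.4)
        version  INTEGER,            -- one row per stored version
        track_id INTEGER,            -- class immediately above in the hierarchy
        score_id INTEGER,            -- every object also references its score
        pitch    INTEGER,
        duration REAL,
        PRIMARY KEY (id, version)
    );
    CREATE TABLE IF NOT EXISTS processes (
        id         INTEGER PRIMARY KEY,
        score_id   INTEGER,
        command    TEXT,
        had_effect INTEGER,          -- lets undo skip empty operations
        started    REAL,             -- time stamps delimiting the process
        finished   REAL
    );
    CREATE TABLE IF NOT EXISTS midi_events (
        timestamp REAL,
        status    INTEGER,
        data1     INTEGER,
        data2     INTEGER
    );
    """)

    # Combining MIDI data with the process table via an SQL join: pair each
    # MIDI event with the process that was active at its timestamp.
    rows = conn.execute("""
        SELECT p.id, p.command, m.timestamp, m.data1
        FROM midi_events AS m
        JOIN processes AS p
          ON m.timestamp BETWEEN p.started AND p.finished
        WHERE p.score_id = ?
    """, (1,)).fetchall()

Filtering explicitly on the score, as in the footnote above, keeps such queries fast even when the database contains multiple scores.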
3.4 Separation of Identity from State

Whether using qualitative or quantitative data, working with multiple versions of a single score can often lead to reference problems. The score is not static over time in these situations, but most of our musical terminology for labeling material depends on it being so. For example, a label such as "the first A5 in measure 7" is tied inextricably to its current state. If the note is transformed, or measures are added in front of it, it is no longer an accurate descriptor. Adding timing information makes it easier to find the particular material as it existed at the moment, but it does not provide a sense of its identity across time. Labeling systems may be used where the software supports them, but these depend on the composer or the researcher to maintain consistency, as David Collins notes in his long-term study of compositional process [1].

Rather than identifying material in the score based on mutable state, it is a better approach to use a permanent identifier that is decoupled from state, e.g., "note #1273". In relational database design, synthetic keys are preferred over natural keys for this same reason: they preserve the unique identity of a row even when its values change. By using a synthetic key to identify notes and other objects, continuity across time is preserved. It also makes it simple to compare different versions of the same passage which are separated by a large amount of time. This does not preclude searches based on content or comparison, but it reduces dependency on comparison. It is worth noting, however, that this approach is only practical in a situation where the program automatically manages this information.

Thema identifies material by using unique identification numbers. Each object in the score has a unique, permanent identifier as well as a version number. The first value is static and identifies an object across time; it is also guaranteed to be unique across object classes, e.g., for a note with the identifier #1273, no other objects in the database will share the same identifier, even across scores. The second value indicates the specific version of the object within the database table; each version occupies a row in the table. The permanent identifier decouples an object's identity from its state, which allows searches to be conducted on the basis of identity rather than musical content. Search based on comparison is certainly still possible, but it is not necessary in order to locate material across time. This is useful because identity-based searches will typically run several orders of magnitude more rapidly than comparison-based ones; for example, it is much simpler to find the history of a particular group of notes by searching for rows with corresponding identifiers than it is to compare thousands of iterations of a score.

3.5 Attribution and Context

When storing data, it is useful not only to be able to identify changed material but also to know how the changes were effected. For example, a "cut" operation is identical to a "clear" operation in terms of the difference between successive states in the score; in the case of a "cut" command, however, it is likely that the material may reemerge as the result of a "paste" command, and it is helpful to identify this relationship as it indicates a larger cognitive process. In this example, Thema will not only indicate the command type during storage, but it will also store a set of parent-child relationships between the source and destination materials so that the link between the two states of the score, however distant, is maintained. The software also records the context of the program, including the location of the cursor, any selections in the score, the current state of playback, and so forth.

Additionally, in storing objects, Thema makes a distinction between directly edited objects and objects that have changed state as a result of edits to other objects. For example, a change in the duration of a note appearing at the beginning of the measure would affect the state of subsequent notes; the first note would be considered to be directly edited, while the other notes would be marked as indirectly edited in the database. Similarly, deleting a measure causes the measure numbers of all subsequent measures to be decremented. By indicating the target of the operation, Thema reduces ripple effects in the data and provides a more accurate picture of the composer's actions.

3.6 Model Objects

Latency can be a challenge for object-oriented applications which use databases for storage. If an object changes before it is correctly stored, the values in the database could become inconsistent with the program values. At the same time, it is important that the user interface for the program is responsive to its user, so it is also impractical to delay processing user input while storage is occurring. To address this problem, Thema uses immutable data objects between the application logic and the database. For each object in the score or the environment, the program maintains an immutable data model of the object's current state, e.g., a Note object contains a reference to a NoteModel object. 3 When an object may have changed as the result of a command, a new model is constructed and compared against the previous state; if different, the new model is added to the storage queue and replaces the previous model. Though the construction of models increases memory requirements, it also allows storage to safely proceed in a separate thread from the user interface, ensuring responsiveness while maintaining data integrity. Because the model objects cannot change once created, they may also be safely cached in memory in order to accelerate loading when undoing or redoing actions in the score. Each model has two fields titled process and kill process; these fields mark the lifespan of the specific model. Figure 3 and Table 1 show an example of the lifecycle of models in Thema.

3 For a useful discussion of the virtues of immutability, see [15].
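The scheme lends itself to a compact sketch. The Python fragment below is illustrative only: NoteModel is named in the text, but its fields, the queue, and the lifespan rule (a model is in effect at process P if its process value is at most P and its kill process value is null or greater than P, as inferred from Table 1 below) are assumptions, not Thema's actual implementation:

    import queue
    import threading
    from dataclasses import dataclass
    from typing import Optional

    @dataclass(frozen=True)       # immutable: safe to cache and share across threads
    class NoteModel:
        object_id: int            # permanent identity, never reused
        version: int              # one database row per version
        pitch: int
        duration: float
        process: int              # process that created this model
        kill_process: Optional[int] = None   # process that superseded it

    storage_queue: "queue.Queue[NoteModel]" = queue.Queue()

    def note_changed(current: NoteModel, new: NoteModel) -> NoteModel:
        """Compare a freshly constructed model against the previous state;
        enqueue it for storage only if it actually differs."""
        if (new.pitch, new.duration) != (current.pitch, current.duration):
            storage_queue.put(new)    # written later by the writer thread
            return new                # the new model replaces the previous one
        return current

    def live_at(model: NoteModel, process_id: int) -> bool:
        """True if this model was the object's state at the given process."""
        return model.process <= process_id and (
            model.kill_process is None or model.kill_process > process_id)

    def writer() -> None:
        while True:
            model = storage_queue.get()   # block until there is work
            # ... INSERT a new row keyed by (object_id, version) ...
            storage_queue.task_done()

    threading.Thread(target=writer, daemon=True).start()

Because the user-interface thread only ever hands immutable models to the writer, storage can lag behind editing without the two threads ever disagreeing about an object's state.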
4. DATASET

In order to establish a dataset for studying compositional process data collected in Thema, a study was conducted at New York University with ten graduate-level composers creating piano pieces using the software. Subjects were introduced to the software via a tutorial session, and were then asked to notate a brief excerpt from Bartók's Mikrokosmos; the excerpt was selected because it would require users to perform a variety of tasks, handling time-signature changes, articulations, and multiple voices within a staff. Following this, the remainder of the four-hour session was spent composing a short piano piece for an intermediate-level performer in a style of the composer's choosing. This controlled setting ensured that participants had access to a full-size MIDI keyboard and were able to receive technical support if they had questions about the software. While this arrangement is not ideal from the standpoint of naturalism, it ensured that the data was captured reliably and that composers were able to use the program, and the knowledge gained will allow future, less-restrictive studies to proceed. Though the composers had relatively little time to work with the software, all of the composers were able to complete the study. At the end of the session, the composers provided a segmentation of the work, indicating major sections, as well as any smaller subdivisions. The composers were compensated for their time and agreed to release their work and data under a Creative Commons license with the option of being attributed for their work or remaining anonymous. 4

4 The terms of the license are available at https://creativecommons.org/licenses/by-nc-sa/4.0/

In addition to capturing quantitative data, Thema also recorded screen captures for every entry in the process table. The bounding-box coordinates for currently visible notes, dynamics, and articulations were also stored into a table in the database. This makes it possible to create graphical overlays on the score images without having to use the program itself, and provides a simple means of rapidly browsing through past states of the score. Non-linear browsing and montaging can be realized via database queries, e.g., "select all activity within a two-minute window".

Figure 2. Entities, models, and the database

Process   #239      #245      #247      #250      #253

Figure 3. Timeline

    entity    id    process    kill process
     127     439      239          245
     133     440      239          247
     127     441      245          253
     145     442      245          null
     133     443      247          250
     127     444      253          null

Table 1. Model lifecycle example

5. VISUALIZATIONS

Thema contains a suite of visualizations in a variety of formats for examining compositional data. These visualizations can be synchronized via a central time slider. For example, one window might contain the score at the time indicated by the slider, while another window displays the structure of the score over the course of a two-hour sliding window, and a third window contains a score which displays incoming pitches from the MIDI keyboard in a two-minute window. Each window can also contain multiple graphical overlays, such as showing the location in the score and duration of playback superimposed on the long-term structural view. Thema also features an API for developing user-defined graphical overlays.

The score overlay section of the API allows programmers to access the drawing subroutines for notes in the score. Figure 4 shows a heat map of the composer's edit activity superimposed on the score.

Figure 4. Heatmap Overlay

In Figure 5, the arrows between pairs of notes indicate pairs of notes that were edited in close time proximity to each other. The weight of the line is proportional to the number of times they were edited as well as how closely in time they were edited, with simultaneous edits producing thicker lines. As can be seen, the notes in the first two measures are densely connected to each other; they are also connected to the previous (unseen) measure. The notes in the last measure, however, are not connected to the notes in the prior measures, indicating that they were never edited in close time proximity to the notes in the second measure.
This location is also one of the major boundaries in the score indicated by the composer.
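Both overlays reduce to simple computations over the process table. The sketch below is illustrative rather than Thema's overlay API: the specific weighting scheme (co-edits counted within a time window, weighted by closeness) is an assumption consistent with the description of Figures 4 and 5 above:

    from collections import Counter, defaultdict
    from itertools import combinations

    # (note_id, timestamp) pairs drawn from the process table
    edits = [(101, 5.0), (102, 5.0), (101, 9.5), (103, 60.0)]

    # Heat map: edit count per note, normalized to 0..1 for coloring
    counts = Counter(note_id for note_id, _ in edits)
    peak = max(counts.values())
    heat = {note_id: n / peak for note_id, n in counts.items()}

    # Proximity graph: weight each pair of notes edited within a window,
    # with simultaneous edits contributing the most
    WINDOW = 10.0
    weights = defaultdict(float)
    for (a, ta), (b, tb) in combinations(edits, 2):
        if a != b and abs(ta - tb) <= WINDOW:
            weights[tuple(sorted((a, b)))] += 1.0 - abs(ta - tb) / WINDOW

    print(heat)           # {101: 1.0, 102: 0.5, 103: 0.5}
    print(dict(weights))  # {(101, 102): 1.55}

The stored bounding-box coordinates then allow the resulting intensities and line weights to be drawn directly over the score images, without using the program itself.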
REFERENCES

[1] D. Collins, "Real-time tracking of the creative music composition process," Digital Creativity, vol. 18, no. 4, pp. 239–256, 2007.

[2] ——, "A synthesis process model of creative thinking in music composition," Psychology of Music, vol. 33, no. 2, pp. 193–216, 2005.

[13] A. Agostini and D. Ghisi, "Bach: an environment for computer-aided composition in Max," in Proceedings of the International Computer Music Conference. Ljubljana, Slovenia: ICMC, 2012, pp. 373–378. [Online]. Available: http://quod.lib.umich.edu/cgi/p/pod/dod-idx/bach-an-environment-for-computer-aided-composition-in-max.pdf?c=icmc;idno=bbp2372.2012.068
                             STANDARD MUSIC FONT LAYOUT (SMuFL)
considering the needs of the in-development application, provided the impetus to create a new standard, with the following goals identified from the outset:

2.1 Extensible by design

The existing Unicode Musical Symbols range is a fixed set of 220 characters in a fixed range of 256 code points at U+1D100–U+1D1FF. This range is not easily extensible, though of course it would be possible for one or more non-contiguous supplemental ranges to be added to future versions of Unicode.

Sonata pre-dates the introduction of Unicode: in common with other PostScript Type 1 fonts of its age, it uses an 8-bit encoding that limits its repertoire of glyphs to a maximum of 256 within a single font. Fonts that broadly follow a Sonata-compatible layout are therefore likewise limited to a maximum of 256 glyphs, and as their developers have needed to further expand their repertoire of characters, they have unilaterally added separate fonts, with no agreement about which characters should be included at which code points.

A new standard should be extensible by design, such that even if the repertoire of characters needs to expand, there is both a procedure for ratifying the inclusion of new characters into the standard, and a means for individual font designers or software developers to add glyphs for their own private use in a way that does not break the standard for other users.

2.2 Take advantage of modern font technologies

The development of the Unicode standard and the OpenType font specification, and their adoption by operating system, software, and font developers, are both enormously important: Unicode provides categorization and structure to the world's language systems, while OpenType enables the development of more advanced fonts with effectively unlimited glyph repertoires and sophisticated glyph substitution and positioning features.

A new standard should enable software developers and font designers to build software that takes advantage of these features, without tying the standard to a specific set of technologies, so that it is as broadly applicable and resistant to future obsolescence as is practical to achieve.

2.3 Open license

In order to minimize the number of obstacles for software developers and font designers to adopt the new standard, it should be free of onerous software licensing terms.

A new standard should be released under a permissive, open license that protects Steinberg's copyright in the standard but makes it free for anybody to use in whole or in part in any project, whether that project itself is made available on a commercial basis or under a permissive or free software license.

Accordingly, Steinberg has released SMuFL under the MIT License, 6 which is a permissive free software license that allows reuse within both proprietary and open source software.

6 See http://opensource.org/licenses/MIT

2.4 Practical and useful

Although it is impossible to say with certainty why the Unicode Musical Symbols range has failed to gain support among software developers and font designers, it is reasonable to assume that the range did not sufficiently solve the existing problems with the ad hoc Sonata-compatible approach, perhaps most crucially the lack of extensibility afforded by the limit of 220 characters, which represented only a very modest expansion of the 176 characters present in Sonata.

A new standard should not only be extensible, but should be developed with the practical needs of software developers and font designers as the top priority, including providing detailed technical guidelines on how to solve some of the issues inherent in representing music notation using a combination of glyphs drawn from music fonts and drawn primitive shapes (stroked lines, filled rectangles, curves, etc.).

2.5 Facilitate easier interchange

Existing music fonts have been developed in isolation by independent software developers and font developers, and despite a broad intent to make it possible for end users of scoring programs to use a variety of fonts, including those designed for other applications, in practice the level of compatibility between fonts and scoring programs is rather low.

A comparison of the repertoire of glyphs in Sonata, Petrucci, and Opus shows that only 69 of 176 glyphs in Sonata are also present in both Petrucci and Opus; a further 38 glyphs are present in Sonata and Petrucci, but not Opus; a further 5 glyphs are present in Petrucci and Opus, but not Sonata; and a further 59 glyphs in Sonata are present in neither Opus nor Petrucci.

Furthermore, there is no practical way for an end user to know in advance of attempting to use a different font whether or not a given range of characters is implemented in that font, and when transferring documents created in software between systems there is little guarantee that the software can translate the required glyph from one font to another.
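With an agreed mapping, checks of this kind become easy to script. As an illustration, the following Python sketch uses the fontTools library to report which code points in a given range a font actually implements; the font path is illustrative, and U+E260–U+E26F is the SMuFL range for standard accidentals, described later in this paper:

    from fontTools.ttLib import TTFont

    font = TTFont("SomeMusicFont.otf")     # illustrative path
    cmap = font["cmap"].getBestCmap()      # maps code points to glyph names
    for cp in range(0xE260, 0xE270):
        print(f"U+{cp:04X}: {cmap.get(cp, '-- not implemented --')}")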
A new standard should improve the compatibility of music fonts between different systems by providing not only an agreed mapping of characters to specific code points, but also a means for font designers to describe programmatically the repertoire of characters implemented in a given font.
2.6 Build community support

The range of symbols used in Western music notation is so deep and broad that it is difficult for any individual person or small group to have sufficient knowledge to correctly identify and categorize the characters. Furthermore, without broad support among software developers and font designers, any new standard is destined to languish unused.

A new standard should be developed in the open, inviting interested parties to contribute ideas and discussion to the development of the repertoire of characters, their categorization, and technical recommendations about font design, glyph metrics, and glyph registration.

3. NON-GOALS FOR A NEW STANDARD

At the outset of the project, it was determined that, in the short- to medium-term at least, targeting ratification of the new standard by the Unicode Consortium in order to broaden the range of musical symbols encoded by Unicode was not a goal of the project. Developing the standard independently, away from the more rigorous requirements of the proposal and review process, gives greater agility and faster iteration as new requirements emerge.

Initially it was also determined that attempting to develop a set of recommendations for fonts to be used inline with text fonts in word processing or page layout software would be too much work to undertake right away, in addition to the core goal of developing recommendations for fonts to be used in specialized music notation software. However, after the launch of the new standard at the Music Encoding Conference in Mainz, Germany in May 2013, the members of the nascent community identified this as a high-priority activity, and the development of guidelines for fonts to be used in text-based applications was added as a requirement for the first stable release of the new standard.

The Standard Music Font Layout, or SMuFL (pronounced "smoofle"), provides both a standard way of mapping music symbols to the Private Use Area of Unicode's Basic Multilingual Plane, and a detailed set of guidelines for how music fonts should be built. As a consequence of the joint effort of the community that has arisen around the development of the standard, it also provides a useful categorization of thousands of symbols used in Western music notation.

4.1 Character repertoire and organization

The initial public release of SMuFL, version 0.4, included around 800 characters. By the time of the release of version 1.0, in June 2014, the total number of characters included had grown to nearly 2400, organized into 104 groups.

SMuFL makes use of the Private Use Area within Unicode's Basic Multilingual Plane (code points from U+E000–U+F8FF). The Unicode standard includes three distinct Private Use Areas, which are not assigned characters by the Unicode Consortium so that they may be used by third parties to define their own characters without conflicting with Unicode Consortium assignments.

SMuFL is a superset of the Unicode Musical Symbols range, and it is recommended that common characters are included both at code points in the Private Use Area as defined in SMuFL and in the Unicode Musical Symbols range.
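As a minimal illustration of this dual encoding, the G clef can be addressed either through the Unicode Musical Symbols range or through its SMuFL code point (U+E050, the code point of gClef in SMuFL 1.0):

    import unicodedata

    GCLEF_UMS   = "\U0001D11E"   # MUSICAL SYMBOL G CLEF
    GCLEF_SMUFL = "\uE050"       # gClef, in the Private Use Area

    print(unicodedata.name(GCLEF_UMS))        # MUSICAL SYMBOL G CLEF
    print(unicodedata.category(GCLEF_SMUFL))  # 'Co': private use, no standard name

In a font that follows the recommendation, both strings select the same glyph.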
The groups of characters within SMuFL are based on the groupings defined by Perry Roland in the Unicode Musical Symbols range, but with finer granularity. There are currently 108 groups, proceeding roughly in order from least to most idiomatic, i.e. specific to particular instruments, types of music, or historical periods. The grouping has no significance other than acting as an attempt to provide an overview of the included characters.

Groups are assigned code points in multiples of 16. Room for future expansion has, where possible, been left in each group, so code points are not contiguous. The code point of each character in SMuFL 1.0 is intended to be immutable, and likewise every character has a canonical name, also intended to be immutable. Since the release of SMuFL 1.0, a few additional characters have already been identified that should be added to groups that were already fully populated, and, in common with the approach taken by the Unicode Consortium, new supplemental groups have been added at the end of the list of existing groups to accommodate these additions.

No formal criteria have been developed for whether or not a given character is suitable for inclusion in SMuFL. Members of the community make proposals for changes and additions to the repertoire of characters, giving rise to public discussion, and once consensus is reached, those changes are made in the next suitable revision.
In general a character is accepted if it is already in widespread use: although composers and scholars invent new symbols all the time, such a symbol can only be included in SMuFL if there is broad community support.

4.3 Recommended and optional glyphs

One of the aims of SMuFL is to make it as simple as possible for developers both of fonts and of scoring software to implement support for a wide range of musical symbols. Although modern font technologies such as OpenType enable a great deal of sophistication in automatic substitution features, applications that wish to use SMuFL-compliant fonts are not obliged to support advanced OpenType features.

The basic requirements for the use of SMuFL-compliant fonts are the ability to access characters by their Unicode code point, to measure glyphs, and to scale them (e.g. by drawing the font at different point sizes). If applications are able to access OpenType features such as stylistic sets and ligatures, then additional functionality may be enabled.

However, all glyphs that can be accessed via OpenType features are also accessible via an explicit code point. For example, a stylistic alternate for the sharp accidental designed to have a clearer appearance when reproduced at a small size can be accessed as a stylistic alternate for the character accidentalSharp, but also by way of its explicit code point, which will be in the range U+F400–U+F8FF.

Because optional glyphs for ligatures, stylistic alternates, etc. are not required, and different font developers may choose to provide different sets (e.g. different sets of glyphs whose designs are optimized for drawing at different optical sizes), SMuFL does not make any specific recommendations for how these glyphs should be assigned explicit code points, except that they must be within the range U+F400–U+F8FF, which is reserved for this purpose and for any other private use required by font or application developers.

In summary, recommended glyphs are encoded from U+E000, with a nominal upper limit of U+F3FF (a total of 5120 possible glyphs), while optional glyphs (ligatures, stylistic alternates, etc.) are encoded from U+F400, with a nominal upper limit of U+F8FF (a total of 1280 possible glyphs).

In order for a font to be considered SMuFL-compliant, it should implement as many of the recommended glyphs as are appropriate for the intended use of the font, at the specified code points. Fonts need not implement every recommended glyph, and need not implement any optional glyphs, in order to be considered SMuFL-compliant.

4.4 SMuFL metadata

To aid software developers in implementing SMuFL-compliant fonts, three support files in JSON format [1] are available.

glyphnames.json maps canonical glyph names to code points; by convention the names use lower camel case, a convenient format for most programming languages. The file is keyed using the glyph names, with the SMuFL code point provided as the value for the codepoint key, and the Unicode Musical Symbols range code point (if applicable) provided as the value for the alternateCodepoint key. The description key contains the glyph's description.

classes.json groups glyphs together into classes, so that software developers can handle similar glyphs (e.g. noteheads, clefs, flags, etc.) in a similar fashion. Glyphs are listed within their classes using the names specified in glyphnames.json. Not all glyphs are contained within classes, and the same glyph can appear in multiple classes.

ranges.json provides information about the way glyphs are presented in discrete groups in the specification. This file uses a unique identifier for each group as the primary key, and within each structure the description specifies the human-readable range name, glyphs is an array listing the canonical names of the glyphs contained within the group, and the range_start and range_end key/value pairs specify the first and last code point allocated to the range, respectively.
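A sketch of how an application might consume these files follows; the key names are those described above, while the "U+E050"-style string format for code point values is an assumption based on the published metadata files:

    import json

    def smufl_char(name: str, path: str = "glyphnames.json") -> str:
        """Resolve a canonical SMuFL glyph name to its character."""
        with open(path, encoding="utf-8") as f:
            glyphs = json.load(f)
        codepoint = glyphs[name]["codepoint"]        # e.g. "U+E262"
        return chr(int(codepoint.removeprefix("U+"), 16))

    def glyphs_in_class(class_name: str, path: str = "classes.json") -> list:
        """List the canonical names of the glyphs in a class."""
        with open(path, encoding="utf-8") as f:
            return json.load(f)[class_name]

    # e.g. smufl_char("accidentalSharp") -> "\ue262"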
4.5 Font-specific metadata

It is further recommended that SMuFL-compliant fonts also contain font-specific metadata JSON files. The metadata file allows the designer to provide information that cannot easily (or in some cases at all) be encoded within or retrieved from the font software itself, including recommendations for how to draw the elements of music notation not provided directly by the font itself (such as staff lines, barlines, hairpins, etc.) in a manner complementary to the design of the font, and important glyph-specific metrics, such as the precise coordinates at which a stem should connect to a notehead.

Glyphs may be identified either by their Unicode code point or by their canonical glyph name (as defined in the glyphnames.json file). Measurements are specified in staff spaces, using floating point numbers to any desired level of precision.

The only mandatory values are the font's name and version number. All other key/value pairs are optional.

The engravingDefaults structure contains key/value pairs defining recommended defaults for line widths, etc.
The glyphsWithAnchors structure contains a structure for each glyph for which metadata is supplied, with the canonical glyph name or its Unicode code point as the key, and is discussed in more detail below.

The glyphsWithAlternates structure contains a list of the glyphs in the font for which stylistic alternates are provided, together with their name and code point. Applications that cannot access advanced font features like OpenType stylistic alternates can instead determine the presence of an alternate for a given glyph, and its code point, using this data.

The glyphBBoxes structure contains information about the actual bounding box for each glyph. The glyph bounding box is defined as the smallest rectangle that encloses every part of the glyph's path, and is described as a pair of coordinates for the bottom-left (or southwest) and top-right (or northeast) corners of the rectangle, expressed in staff spaces to any required degree of precision, relative to the glyph origin.

The ligatures structure contains a list of ligatures defined in the font. Applications that cannot access advanced font features like OpenType ligatures can instead determine the presence of a ligature that joins together a number of recommended glyphs, and its code point, using this data.

The sets structure contains a list of stylistic sets defined in the font. Applications that cannot access advanced font features like OpenType stylistic sets can instead determine the presence of sets in a font, the purpose of each set, and the name and code point of each glyph in each set, using this data.

4.5.1 Example of how font-specific metadata is used

Figure 1 shows how font-specific metadata may be used in conjunction with the conventions of glyph registration to construct two notes: an up-stem 16th note (semiquaver), and a down-stem 32nd note (demisemiquaver).

Figure 1: Diagram illustrating how points defined in font-specific metadata can be used by scoring software.

• The horizontal grey lines denote staff lines, for scale.

• The dashed boxes show glyph bounding boxes, with the left-hand side of the box corresponding to x=0, while the horizontal lines bisecting the blue boxes show the origin for each glyph, i.e. y=0.

• The shaded red boxes show the locations of the glyph attachment points, as specified in the font metadata JSON file.

• The shaded area on the down-stem note shows the amount by which a stem of standard length (i.e. the unfilled portion of the stem) should be extended in order to ensure good on-screen appearance at all zoom levels.

Note that the stemUpSE attachment point corresponds to the bottom right-hand (or south-east) corner of the stem, while stemDownNW corresponds to the top left-hand (or north-west) corner of the stem. Likewise, for correct alignment, the flag glyphs must always be aligned precisely to the left-hand side of the stem, with the glyph origin positioned vertically at the end of the normal stem length.
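The following sketch shows how a scoring application might apply these attachment points. The structure and key names follow this section, but the coordinate values are illustrative, not taken from any actual font:

    # Excerpt of a font-specific metadata file, as a Python dict
    metadata = {
        "fontName": "ExampleMusicFont",
        "fontVersion": 1.0,
        "engravingDefaults": {"stemThickness": 0.12},  # staff spaces (illustrative)
        "glyphsWithAnchors": {
            "noteheadBlack": {
                "stemUpSE": [1.18, 0.168],    # bottom right corner of an up-stem
                "stemDownNW": [0.0, -0.168],  # top left corner of a down-stem
            }
        },
    }

    def stem_origin(x: float, y: float, stem_up: bool) -> tuple:
        """Corner at which the stem meets a black notehead whose origin is
        placed at (x, y); all values are in staff spaces."""
        anchors = metadata["glyphsWithAnchors"]["noteheadBlack"]
        dx, dy = anchors["stemUpSE"] if stem_up else anchors["stemDownNW"]
        return x + dx, y + dy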
4.6 Glyph registration and metrics recommendations

In addition to providing a standard approach to how musical symbols should be assigned to Unicode code points, SMuFL also aims to provide two sets of guidelines for metrics and glyph registration, addressing the two most common use cases for fonts that contain musical symbols, i.e. use within dedicated scoring applications, and use within text-based applications (such as word processors, desktop publishers, web pages, etc.).

Since it is helpful for scoring applications that all symbols in a font be scaled relative to each other as if drawn on a staff of a particular size, and conversely it is helpful for musical symbols drawn in-line with text to be scaled relative to the letterforms with which they are paired, in general a single font cannot address these two use cases: the required metrics and relative scaling of glyphs are incompatible.

Therefore, it is recommended that font developers make clear whether a given font is intended for use by scoring applications or by text-based applications by appending "Text" to the name of the font intended for text-based applications; for example, "Bravura" is intended for use by scoring applications, and "Bravura Text" is intended for use by text-based applications (or indeed for mixing musical symbols with free text within a scoring application).

The complete guidelines for key font metrics and glyph registration are too detailed to reproduce here; they can be read in full in the SMuFL specification. 7 Those guidelines that apply to the font as a whole, rather than to specific groups of glyphs, are reproduced below.

7 See http://www.smufl.org/download
4.6.1 Guidelines for fonts for scoring applications

Dividing the em in four provides an analogue for a five-line staff: if a font uses 1000 upm (design units per em), as is conventional for a PostScript font, one staff space is equal to 250 design units; if a font uses 2048 upm, as is conventional for a TrueType font, one staff space is equal to 512 design units.

The origin (bottom left corner of the em square, i.e. x = 0 and y = 0 in font design space) therefore represents the middle of the bottom staff line of a nominal five-line staff, and y = 1 em represents the middle of the top staff line of that same five-line staff.

All glyphs should be drawn at a scale consistent with the key measurement that one staff space = 0.25 em.

Unless otherwise stated, all glyphs shall be horizontally registered so that their leftmost point coincides with x = 0.

Unless otherwise stated, all glyphs shall have zero-width side bearings, i.e. no blank space to the left or right of the glyph.

4.6.2 Guidelines for fonts for text-based applications

Upper case letters in a text font do not typically occupy the whole height of the em square: instead, they typically occupy around 75–80% of the height of the em square, with the key metrics for ascender and caps height both falling within this range. In order for the line spacing of a font containing music characters to be equivalent to that of a text font, its key metrics must match, i.e. the ascender, caps height and descender must be very similar. Glyphs with unusually large ascenders and descenders (such as notes of short duration with multiple flags) should not be scaled individually in order to fit within the ascender height, as they will not then fit with the other glyphs at the same point size; however, the behavior of glyphs that extend beyond the font's ascender and descender metrics is highly variable between different applications.

Leading on from the premise that a SMuFL-compliant font for text-based applications should use metrics compatible with regular text fonts, specific guidelines are as follows:

Dividing 80% of the height of the em in four provides an analogue for a five-line staff. If a font uses 1000 upm (design units per em), as is conventional for a PostScript font, the height of a five-line staff is 800 design units, or 0.8 em; therefore, one staff space is 200 design units, or 0.2 em. If a font uses 2048 upm, as is conventional for a TrueType font, the height of a five-line staff is 1640 design units, and one staff space is 410 design units.

The origin (bottom left corner of the em square, i.e. x = 0 and y = 0 in font design space) therefore represents the middle of the bottom staff line of a nominal five-line staff, and y = 0.8 em represents the middle of the top staff line of that same five-line staff.

Unless otherwise stated, all glyphs should be drawn at a scale consistent with the key measurement that one staff space = 0.2 em.

Unless otherwise stated, all glyphs shall be horizontally registered so that their leftmost point coincides with x = 0.

Unless otherwise stated, all glyphs shall have zero-width side bearings, i.e. no blank space to the left or right of the glyph.

Staff line and leger line glyphs should have an advance width of zero, so that other glyphs can be drawn on top of them easily.
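Both sets of guidelines reduce to a single scale factor between staff spaces and design units. A short sketch (a hypothetical helper, not part of the specification):

    def staff_spaces_to_units(spaces: float, upm: int, text_font: bool = False) -> float:
        """Convert staff spaces to font design units. Scoring fonts use
        1 staff space = 0.25 em; text fonts use 0.2 em, so that the
        five-line staff occupies 80% of the em."""
        em_per_space = 0.2 if text_font else 0.25
        return spaces * em_per_space * upm

    staff_spaces_to_units(1, 1000)                   # 250.0 (PostScript scoring font)
    staff_spaces_to_units(1, 2048)                   # 512.0 (TrueType scoring font)
    staff_spaces_to_units(1, 1000, text_font=True)   # 200.0
    staff_spaces_to_units(1, 2048, text_font=True)   # 409.6 (the guidelines round to 410)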
5. REFERENCE FONT

To demonstrate all of the key concepts of SMuFL, a reference font has been developed. The font family is called Bravura, and consists of two fonts: Bravura, which is intended for use in scoring applications; and Bravura Text, which is intended for use in text-based applications.

The word Bravura comes from the Italian word for "cleverness", and also, of course, has a meaning in music, referring to a virtuosic passage or performance; both of these associations are quite apt for the font. From an aesthetic perspective, Bravura is somewhat bolder than most other music fonts, with few sharp corners on any of the glyphs, mimicking the appearance of traditionally-printed music, where ink fills in slightly around the edges of symbols, and the metal punches used in plate engraving lose their sharp edges after many uses. A short musical example set in Bravura is shown below (Figure 2).

Steinberg has released the Bravura fonts under the SIL Open Font License [2]. Bravura is free to download, and can be used for any purpose, including bundling it with other software, embedding it in documents, or even using it as the basis for a new font. The only limitations placed on its use are that: it cannot be sold on its own; any derivative font cannot be called "Bravura" or contain "Bravura" in its name; and any derivative font must be released under the same permissive license as Bravura itself.

Figure 2. Example of the Bravura font.
6. IMPLEMENTATION CASE STUDY: THE NOVEMBER FONT

Unlike designers of text fonts, music font designers have historically had great freedom, which has been both a blessing and a curse. Before SMuFL, while there was some common sense about what the kernel of music symbols should be (clefs, noteheads, accidentals, etc.), the actual position of characters in the font, their naming (though there was generally none provided), and the addition of rarer symbols beyond the basic set were left up to the designer's imagination and to some specific requirements of the target music notation software.

Things are changing for the font designer with SMuFL, as its main goal is to address the issues of symbol position, naming and repertoire in a universal way. SMuFL is a great source of inspiration for the designer, surely one of its benefits, but it also imposes new constraints and requirements, and leads to a more demanding design workflow.

6.1 The November Font – Summary

The November music font was designed in 1999 specifically for the software Finale. Its repertoire of 330 characters, spread over two font files and spanning historical periods from the Renaissance to the 20th-century avant-garde, was considered large at that time. Before SMuFL, extending November's repertoire had often been considered, but it would most likely have led to a multiplication of font files, as had occurred with, for example, Opus or Maestro, which the designer was reluctant to do; consequently, only small updates had been made over the years.

6.2 Moving to SMuFL

The emergence of SMuFL in 2013 was a great opportunity for November to make a bigger jump: one single font file with a greatly extended range of characters, wrapped in OpenType, and complying with a new standard.

By switching to SMuFL, the font designer, who generally is a single individual, must be ready to face the temptation of adding more and more symbols, making the development process potentially much longer. 8 And not only must the designer deal with thousands of vectors and points, but also to some extent he or she must turn into a programmer. Python scripting, for instance, can be a great ally for generating the required metadata automatically; this was used extensively for the November 2.0 project. 9 For font projects on the scale of SMuFL, it is impractical to create this metadata manually, and, to make the design workflow even better, one can invent sophisticated tools, for instance to compare the font being crafted with the reference font, Bravura. All of these considerations change the font development workflow deeply.

8 Somehow the designer could not resist this temptation with November 2.0 in any case!

9 November 2.0 was made with the open source program FontForge (http://fontforge.github.io/), which has a powerful Python interface.
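As an illustration of that kind of scripting, the sketch below uses FontForge's Python module to emit glyph bounding boxes for the glyphBBoxes structure described earlier; the bBoxSW/bBoxNE key spellings follow the SMuFL metadata convention, the source file name is illustrative, and the export is simplified compared to a real production script:

    import json
    import fontforge   # FontForge's built-in Python module

    font = fontforge.open("November2.sfd")   # illustrative source file
    space = font.em / 4                       # 1 staff space = 0.25 em (scoring font)

    bboxes = {}
    for glyph in font.glyphs():
        if glyph.unicode == -1:               # skip unencoded glyphs
            continue
        xmin, ymin, xmax, ymax = glyph.boundingBox()   # in design units
        bboxes[glyph.glyphname] = {
            "bBoxSW": [round(xmin / space, 3), round(ymin / space, 3)],
            "bBoxNE": [round(xmax / space, 3), round(ymax / space, 3)],
        }

    with open("november_metadata.json", "w", encoding="utf-8") as f:
        json.dump({"glyphBBoxes": bboxes}, f, indent=2)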
November 2.0, released in February 2015, now has over 1100 characters, with about 80% of them coming from the SMuFL specification, and is the first commercially-released font to comply with SMuFL. A short musical example set in November 2.0 is shown below (Figure 3).

Figure 3. Example of the November 2.0 font.

6.3 Compatibility with existing scoring software

Unlike the font Bravura, which for now has largely served as a reference font for SMuFL, commercial SMuFL-compliant music fonts are intended to be used in existing music notation programs.

At the present time, no currently available notation software directly supports SMuFL, though such support is likely forthcoming. In the short- to medium-term, therefore, a SMuFL-compliant font like November 2.0 must still be packaged specifically for each notation program. The SMuFL metadata, for instance, is currently not consumed at all by any of the major existing applications (including Finale, Sibelius, and LilyPond), and idiosyncratic component files 10 must be supplied along with the font in order to ensure a smooth user experience.

10 Finale's Font Annotations and Libraries, Sibelius's House Styles, LilyPond's snippets…

On the positive side, the claim of SMuFL-compliance for a popular music font like November can potentially help serve as an impetus for the developers of music notation software to support SMuFL more quickly.

7. SUPPORT FOR SMUFL

SMuFL 1.0 was released in June 2014. The standard remains under active development, and it is hoped that an increasing number of software developers and font designers will adopt it for their products. Below is a summary of the projects that have been publicly announced that are making use of SMuFL.
7.1 Software with SMuFL support

Steinberg's forthcoming scoring application will support SMuFL-compliant fonts.

The open source scoring application MuseScore supports SMuFL-compliant fonts in version 2.0, which is currently in beta testing. 11

The web browser-based interactive sheet music and guitar tablature software Soundslice uses SMuFL and Bravura for its music notation display. 12

The open source Music Encoding Initiative (MEI) rendering software, Verovio, also uses SMuFL for its music notation display. 13

The commercial scoring application Finale, from MakeMusic Inc., will support SMuFL in a future version. 14 MakeMusic's MusicXML import/export plug-in for Finale, Dolet, supports SMuFL as of version 6.5. 15

The commercial digital audio workstation application Logic Pro X, from Apple Inc., supports SMuFL and is compatible with Bravura from version 10.1. 16

11 See http://musescore.org/en/node/30866
12 See http://www.soundslice.com
13 See https://rism-ch.github.io/verovio/smufl.xhtml?font=Leipzig
14 See http://www.sibeliusblog.com/news/finale-2014d-and-beyond-a-discussion-with-makemusic/
15 See http://www.musicxml.com/dolet-6-5-finale-plugin-now-available/
16 See http://support.apple.com/en-us/HT203718

7.2 Fonts with SMuFL support

In addition to the reference font Bravura, other SMuFL-compliant music fonts are beginning to be available.

MuseScore 2.0 includes SMuFL-compliant versions of Emmentaler and Gootville, based respectively on the Emmentaler and Gonville fonts designed for use with LilyPond.

Verovio includes a SMuFL-compliant font called Leipzig.

Robert Piéchaud has designed an updated version of his November font family that is SMuFL-compliant. 17

17 See http://www.klemm-music.de/notation/november/index.php

8. FUTURE DIRECTIONS

Although SMuFL has reached version 1.0 and contains an enormous range of characters, it remains under active development, and further minor revisions are expected for the indefinite future as new characters are identified, proposed, and accepted for inclusion, and as the need for new or improved metadata is identified.

It is also expected that MusicXML, a widely-used format for the interchange of music notation data between software of various kinds, will develop closer ties to SMuFL in its next major revision, version 4.0, which may necessitate some changes to SMuFL.

9. CONCLUSIONS

In this paper, a new standard for the layout of musical symbols into digital fonts has been outlined. The new standard, called the Standard Music Font Layout (SMuFL), is appropriate for modern technologies such as Unicode and OpenType. Through community-driven development, the standard has reached version 1.0, includes nearly 2400 characters categorized into 104 groups, and is poised for future expansion as necessary. A reference font family, Bravura, has been developed to promote the adoption of the new standard. Both SMuFL and Bravura are available under permissive free software licenses, and are already being adopted by software developers and font designers.

Acknowledgements

SMuFL is developed in the open by a community of music software developers, academics, font designers, and other interested parties. Too many people to list here have contributed to the development of the standard to date, and their contributions have been of great value to the project. 18

18 A full list of contributors is printed in the SMuFL specification, which can be found at http://www.smufl.org/download

10. REFERENCES

[1] ECMA-404, The JSON Data Interchange Format, 1st edition, 2013.

[2] N. Spalinger and V. Gaultney, SIL Open Font License (OFL), 2007.
SVG TO OSC TRANSCODING: TOWARDS A PLATFORM FOR NOTATIONAL PRAXIS AND ELECTRONIC PERFORMANCE

Rama Gottfried
Center for New Music and Audio Technologies (CNMAT)
University of California, Berkeley
[email protected]
Figure 1. Screenshot from the Max for Live environment showing breakpoint functions used to control flocking behaviors of virtual sources in a piece using spatial audio. On the right side of the screen is an OpenGL visualization of point locations generated by the many parameters contained within the algorithm. In a more symbolic notation environment, the parameters of the resulting rendering would be representative of their function.
2. CONTEXTUAL EXAMPLE: COMPOSING FOR SPATIAL AUDIO SYSTEMS

Mixed works for live acoustic instruments and real-time spatial audio rendering systems present significant challenges for relating the graphic visualizations of spatial processing with the traditional musical notations used in a score.
  For example, Ircam's "Spat" 2 MaxMSP 3 library for spatial audio processing comes equipped with several useful forms of representation for visualizing spatial processing. The simplest visualization tools in Spat display the placement of a point source in two dimensions, viewed from either "top" (XY) or "front" (XZ) vantage points. The 2D representation makes the location of points very clear and minimizes occlusion issues. For 3D visualization, Spat's viewer may be easily integrated with OpenGL tools in Max's Jitter library. These are valuable methods for visualizing and developing a conceptual basis for spatial design, but as interactive graphic user interfaces they are time-variant, and so do not provide a clear mechanism for relating spatial processing with score-notated actions to be performed by the instrumentalist.
  Part of this problem is that there is no widely adopted notation system for incorporating spatial movement within a musical score. Interactive UIs are an intuitive way to experiment with and learn the expressive capabilities of new media contexts; however, when seeking to compose temporal structures, time must be represented as well. Traditional music notation symbolically represents the articulation of sound over metric time, and composers trained in this tradition learn to silently hear through the spatial organization of symbols on paper. Similarly, we might develop inner spatial perception by drawing from a long history of dance notation to describe spatial movement [4], which could be used to control a rendering system like Spat. What is needed is a symbolic graphic environment to explore ways of composing for these new types of media contexts: we have many new tools for controlling media, but very few ways of utilizing symbolic notation practice in these contexts.

2.1 Perceptual representations and breakpoint functions

Composition for spatial processing systems typically occurs in media programming environments or digital audio workstations, where the compositional approach must be contextualized within the types of controls provided by the software tools. As with interactive UIs, real-time processing is time-variant, and so the control parameters of a given process need to be contextualized in time if a score is desired. 4 This type of system has a natural affordance towards triggers and breakpoint functions, which are extremely useful for fine control over the movement of one value (Y) over time (X).
  This 2D representation, however, requires our spatial perception to be fragmented into three separate parameters (X-Y-Z or azimuth-elevation-distance).

2 http://forumnet.ircam.fr/product/spat/
3 https://cycling74.com
4 Keeping in mind that the score does not necessarily need to describe all events as "fixed" in time.
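To make this three-way fragmentation concrete, the following Python sketch (purely illustrative; the bpf() helper and the breakpoint lists are hypothetical, not part of Spat or odot) decomposes a single spatial gesture into three independently authored univariate breakpoint functions:

# Hypothetical sketch: a spatial trajectory decomposed into three
# univariate breakpoint functions of (time, value) pairs, as in a DAW
# automation lane or a Max breakpoint editor.
def bpf(breakpoints, t):
    """Piecewise-linear interpolation of a breakpoint function at time t."""
    for (t0, v0), (t1, v1) in zip(breakpoints, breakpoints[1:]):
        if t0 <= t <= t1:
            return v0 + (v1 - v0) * (t - t0) / (t1 - t0)
    return breakpoints[-1][1]

# One perceptual gesture ("sweep from front-left to back-right")
# must be authored as three separate envelopes:
bpf_azimuth   = [(0.0, -45.0), (5.0, 135.0)]   # degrees
bpf_elevation = [(0.0, 0.0),   (5.0, 10.0)]    # degrees
bpf_distance  = [(0.0, 1.0),   (5.0, 3.0)]     # meters

t = 2.5
position = (bpf(bpf_azimuth, t), bpf(bpf_elevation, t), bpf(bpf_distance, t))
print(position)  # (45.0, 5.0, 2.0)

The gesture itself never appears as a single object; it exists only as the coincidence of three separate curves.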
Working in this perforated situation, the user must compose each parameter individually, so there is a natural tendency to focus on a smaller number of dimensions (e.g. a tendency to focus on azimuth over distance). This computationally friendly representation comes at the expense of more intuitive data manipulation, which is always one step removed in univariate control over multi-dimensional spaces. Figure 1 shows an example of this, where many lines of automation are composed to describe the spatial behavior of the three-dimensional space shown on the right.
  The strength of a well-developed notation system is in the way layers of contextual meanings are signaled by a combination of symbols. For example, a staccato dot above a note head is immediately heard and physically felt in the mind of a musician. There is an interpretive act that accompanies a notation symbol that is bound up in a cultural history of practice and experience. This interpretive act may also be present for electronic musicians who have worked with breakpoint functions for many years; however, the breakpoint function is fundamentally a concrete control over a single parameter as a function of time, whereas a symbolic representation is an aggregate of many parameters, functioning through abstract, contextual implications for how it should be interpreted.
  In order to carry the expressivity contained in symbolic notation into other media contexts, we need a way to experiment with different notational systems and strategies outside of music notation.

3. WHY NOT USE MUSIC NOTATION SOFTWARE?

Software has built-in affordances that simplify certain uses, while making other approaches more difficult [6]. As music notation has become increasingly digitalized over the last 20 years, software applications designed for music notation have also become increasingly specialized tools focusing on a specific context at the exclusion of others. Music notation software tools expose the author(s)'s idea of what "music" is, through the types of functions they provide their users.
  The most used music notation programs, Finale 5, LilyPond 6, NoteAbility 7, and Sibelius 8, all target the production of traditional music scores, and also provide mechanisms for MIDI playback of these scores. Many of these applications provide APIs for manipulating the musical parameters computationally, in particular LilyPond's Scheme-based scripting language.
  As we have been discussing, traditional notation was not designed to handle sound's relationship to advanced instrumental techniques, spatial audio, dance, installation art, and so on. Most traditional notation software has pre-defined interpretations of the symbolic information contained in the score, a prime example being the ubiquitous MusicXML 9 format, which is designed specifically for "music," and so does not provide an optimal encoding for symbolic notation for contexts traditionally thought of as outside "music." While most of the above musical notation programs do allow the user to create custom symbols, the user is bound to an underlying assumption that the score is either to be read only by humans or to be performed as MIDI within a specifically musical context.
  Computer-aided composition tools such as Abjad 10, Bach 11, INScore 12, MaxScore 13, OpenMusic 14, and PWGL 15 provide environments for algorithmically generating musical scores, as well as providing connectivity to the types of new media outputs mentioned above. However, these tools rely on text input or visual programming, requiring the artist to formalize their thought process to function within the confines of a computational structure. In some cases basic drawing tools are available; however, they are limited in flexibility.
  Other experimental notation programs (e.g. GRM's Acousmographe [7], IanniX 16, MaxScore's Picster 17, Pure Data's Data Structures [8], etc.) provide new ways of performing graphic information, but they also contain symbolic limitations, which are not found in a graphic design environment, either through a forced method of playback, or through a limitation of graphic flexibility. Thus, at the moment, purely graphic design tools seem to provide a more flexible option for developing – and composing with – appropriate notation systems. For this reason, many contemporary composers use Adobe Illustrator for creating their scores.

5 http://www.finalemusic.com
6 http://www.lilypond.org
7 http://debussy.music.ubc.ca/NoteAbility – note: NoteAbility provides some support for communication with MaxMSP through the use of breakpoint functions and qlist max messages, as well as an option to export to the Antescofo (http://repmus.ircam.fr/antescofo) score following format. This is useful for traditional music, but does not solve the problem of symbolic notation of the processes.
9 http://www.musicxml.com
10 http://abjad.mbrsi.org
11 http://www.bachproject.net
12 http://inscore.sourceforge.net
13 http://www.computermusicnotation.com
14 http://repmus.ircam.fr/openmusic/home
15 http://www2.siba.fi/PWGL
16 http://www.iannix.org
The UPIC (Unité Polyagogique Informatique CEMAMu) project was one of the first to connect the act of human drafting with digital sound resources [9]. This integration of the drawing gesture is related to the working method in discussion here; the design of the UPIC was, however, very much tied to specific rendering contexts (amplitude envelopes, waveforms, etc.) and so required specific kinds of symbolic composition, which makes it not extendable to other types of interpretation.

3.1 Sketching and babbling

The recent MusInk [10] and InkSplorer [11] projects have shown that Livescribe 18 pen technology may also be a way to connect symbolic thinking on paper with digital rendering capabilities. The MusInk project also provides the capability to assign a type to an arbitrary symbol, which is closely related to the present study; however, since these Livescribe projects are designed for paper, they forgo some of the possibilities offered by graphic design environments. These studies point to the importance of sketching in developing new graphic ideas.
  Sketches are by definition incomplete, and provide the mind with an image to reflect on and continually refine through iteration [12]. David Wessel describes this type of enactive engagement in his discussion of babbling as a method for free experimentation in sensory-motor development in language and instrument learning, leading towards the development of a "human-instrument symbiosis" [13]. Such a symbiosis should also be possible with symbolic thought and computer-controlled rendering systems.

3.2 Performing digital graphic scores

Graphic design applications like Adobe Illustrator, InkScape, and OmniGraffle are created to have the basic affordances of a drafting table: a piece of paper, pen, stencils, and ruler – with the end goal of creating publication-ready documents. There are no built-in musical functions, no button for transposition, no MIDI playback, etc. What these applications provide are the basic tools for visually creating whatever it is that you want to draw, generally in two dimensions. The user is left to decide what the meaning of the graphics might be.
  Composers who choose to work in graphic design programs rather than music notation programs are silently stating that they do not expect to be able to render their score with the computer in the way that a typical music notation program will have built-in MIDI playback. Rather, it is implied that they accept that, due to software constraints, their work is graphic, and either meant to be performed only by humans who will be able to interpret the score, or that the score is a descriptive notation of electronic results rather than a prescriptive notation of how to perform the material. However, this does not need to be the case. As a preliminary study implementation, the SVG output of Illustrator was used as a container for performable graphic information, leveraging Illustrator's layer panel as a control for hierarchical grouping.

Figure 2. An example of using Adobe Illustrator's grouping mechanism to create a hierarchy of graphic data

4. IMPLEMENTATION

Scalable Vector Graphics (SVG) 19 is an XML-based open standard developed by the World Wide Web Consortium (W3C) for two-dimensional graphics. In addition to being widely supported in software applications, the SVG Standard provides several key features that make it an attractive solution for digital graphic notation: (1) it is human readable, which makes it easy to open an SVG file in a text editor and understand how the data is structured; (2) the SVG format provides a palette of primitive vector shapes that are the raw building blocks for most notations (and also provides tags for adding further types); (3) inheriting XML's tree structure model, SVG provides standardized grouping and layer tags allowing users to create custom hierarchies of graphic objects; and (4) the header for SVG files includes the canvas information for contextualizing the content of the file.
  In this paper, we propose replacing the graphics renderer with a new type of rendering interpretation, be it sonic, spatial, kinetic, or any other possible output type. Thought of this way, the SVG file functions as hierarchical input data, to be rendered, or performed, by an electronic instrument. In our implementation, OpenSoundControl (OSC) [14] serves as a transcoding layer used for processing an interpretation of the SVG file structure.

4.1 SVG → Odot → Performance

As a first test to interpret and perform the SVG score within the Max environment, the LibXML2 20 library was used to parse the SVG file created in Adobe Illustrator (figure 2) and convert the SVG tree (figure 3) into an OSC bundle (figure 4). For convenience, this was implemented in C and put in a Max object called "o.svg". The SVG graphic data was then reformatted and interpreted for performance utilizing the "odot" OSC expression language developed at CNMAT over the last few years [15].

18 http://www.livescribe.com
19 SVG Standard: http://www.w3.org/TR/SVG
20 http://xmlsoft.org
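As an illustration of the transcoding step (this is not the o.svg object itself, which is implemented in C on LibXML2; it is a minimal Python sketch of the same idea), the following code walks an SVG tree with the standard-library ElementTree parser and flattens it into OSC-style address/value pairs keyed by each element's id attribute:

# Illustrative sketch of SVG-to-OSC flattening; the paper's o.svg object
# is implemented in C with LibXML2, not with this code.
import xml.etree.ElementTree as ET

SVG_SOURCE = """
<svg xmlns="http://www.w3.org/2000/svg" width="400" height="400">
  <g id="note-duration-event">
    <circle id="notehead" cx="100" cy="300" r="3.76"/>
    <line id="duration" x1="100" y1="300" x2="200" y2="200"/>
  </g>
</svg>
"""

def local_name(element):
    """Strip the XML namespace prefix from an element tag."""
    return element.tag.split('}')[-1]

def transcode(element, address=""):
    """Recursively flatten an SVG tree into OSC-style address/value pairs.

    Each element contributes a path segment: its id attribute if present,
    otherwise its tag name, so the Illustrator grouping hierarchy survives
    as an address hierarchy."""
    name = element.get("id", local_name(element))
    address = address + "/" + name
    messages = []
    for key, value in element.attrib.items():
        if key != "id":
            messages.append((address + "/" + key, value))
    for child in element:
        messages.extend(transcode(child, address))
    return messages

for osc_address, value in transcode(ET.fromstring(SVG_SOURCE)):
    print(osc_address, value)
# e.g. /svg/note-duration-event/notehead/cx 100

Because each address mirrors the grouping hierarchy drawn in Illustrator, the resulting bundle remains queryable downstream by group and object name.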
              Figure 3. The contents of the SVG file, showing the hierarchical graphic information designed in Figure 2.
Based on the OSC protocol, CNMAT's new odot library provides tools for handling OSC bundles as time-synchronized hierarchical data structures within data-flow programming environments like Max, Pure Data (Pd) 21, and NodeRed 22. Through odot's expression language, the OSC bundle becomes a local scope for time-synchronized variables in which functions may be applied referencing the named contents of the bundle [16]. Thus, by transcoding the SVG file contents into an OSC bundle, it is possible to process the data in Max/Pd, interpreting the values intended for graphic rendering as control parameters for synthesis, spatialization, and any other parameter controllable with the computer.

Figure 4. The contents of the SVG file designed in Figure 2, transcoded to Odot in the MaxMSP programming environment.

  The graphic content and grouping relations within an SVG file are described by the organization of XML elements and graphic primitives as specified by the SVG Standard. For example, a group of SVG elements might look like this:

<g id="note-duration-event">
  <circle id="notehead" cx="100" cy="300" r="3.76"/>
  <line id="duration" fill="none" stroke="#000000" stroke-width="3" stroke-miterlimit="10" x1="100" y1="300" x2="200" y2="300"/>
</g>

Each SVG element follows a similar structure: the element tag name is followed by a list of attributes to the
  After transcoding the SVG file into OSC, the SVG data may be interpreted and performed in Max through the odot library, allowing us to sort and iterate over the items, and to apply interpretive functions (figure 5).

4.2 Grouping strategies

With the transcoding from SVG to OSC in place, it becomes possible to begin composing within a graphic design program in a way that facilitates interpretation and performance downstream in OSC. Using the id attribute to identify groups and graphic objects, it is then possible to use the SVG tree structure as a framework for developing grammars which can be used later to interpret the graphic information for the generation of control parameters.
  Taking traditional musical notation as a starting point, a logical structural design to facilitate rendering might be something like the one illustrated in figure 6. With the root <svg> tag understood as the global container for a full score, the next largest container would be the page, followed by a system which might contain many instrument staves (or other output types), each with their own staff and clef. Within the individual staff group, there might be graphic information providing the bounds of the staff (e.g. lines marking different qualities within the vertical range of the staff as described by the clef), with the X dimension usually (but not necessarily) representing time. Within this grammar structure, the bounds of the staff provide a context for interpreting event objects contained within the staff group. Further, each event object grouping may contain any number of graphic objects. For example, a note-duration-event might contain a shape identified as a notehead, with other graphic objects representing the event's

21 http://puredata.info
22 http://nodered.org
Figure 5. Example of storing interpretations of graphic information contained in the SVG file in Odot lambda functions. Here, the
function describes a process of interpretation where the x1 value indicates the start time, scaled to a given time constant, and the duration
is the horizontal span of the object.
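The odot patch in Figure 5 cannot be reproduced here, but the mapping it describes is simple to sketch in Python (names such as time_constant and interpret_duration_event are assumptions for illustration): the x1 coordinate of a graphic object is scaled to an onset time, and its horizontal span becomes a duration.

# Sketch of the interpretive mapping described in Figure 5; illustrative
# Python only -- the paper implements this as odot expressions inside Max.
time_constant = 0.01  # assumed scaling: 100 SVG units per second

def interpret_duration_event(attributes, time_constant):
    """Map a transcoded line object to (onset, duration) in seconds."""
    x1 = float(attributes["x1"])
    x2 = float(attributes["x2"])
    onset = x1 * time_constant
    duration = (x2 - x1) * time_constant
    return onset, duration

# The "duration" line from the SVG example above:
print(interpret_duration_event({"x1": "100", "x2": "200"}, time_constant))
# -> (1.0, 1.0)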
[Figure 6: an example grammar hierarchy, with nodes for page, system, and stave; each stave contains a clef, staff bounds, and note-duration-event objects.]
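Under such a grammar, the flattened OSC namespace for a score would mirror the figure 6 hierarchy. A hypothetical bundle (all addresses invented here for illustration, not produced by o.svg as published) might look like:

# Hypothetical OSC addresses produced by flattening a score organized
# according to the figure 6 grammar (names chosen for illustration only).
score_bundle = {
    "/page/1/system/1/stave/1/clef": "treble",
    "/page/1/system/1/stave/1/bounds/y1": 250.0,
    "/page/1/system/1/stave/1/bounds/y2": 350.0,
    "/page/1/system/1/stave/1/note-duration-event/1/notehead/cx": 100.0,
    "/page/1/system/1/stave/1/note-duration-event/1/notehead/cy": 300.0,
    "/page/1/system/1/stave/1/note-duration-event/1/duration/x1": 100.0,
    "/page/1/system/1/stave/1/note-duration-event/1/duration/x2": 200.0,
}
# The staff bounds give a vertical context for interpreting each event's
# notehead position; the X axis is (usually) read as time.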
indeed be useful, providing a database and Model View Controller (MVC) architecture for interacting with score data.
  Adobe's recently announced support for Node.js 23 opens up several new options for working within the Illustrator application. For example, with odot's SWIG 24 based JS bindings, a Node plug-in could be created to stream OSC score data directly from Illustrator without the need to save the file to disk and reload it with the o.svg object. With the addition of Node as a plug-in backend, this means that Paper.js 25 could be used to create custom interactive GUIs for handling data. Paper.js is developed by the same team who wrote Scriptographer 26, a powerful JS-based suite for building drawing tools in Illustrator, which was one of the initial tools for an earlier version of our study. Unfortunately, Adobe drastically changed their plugin design in Illustrator CS6, which broke Scriptographer. This is an important point to be considered for any future work in the present study, and is one indicator that an independent design environment might possibly be a more reliable long-term solution. A future iteration of the study might be in the form of a Node and Paper.js based editor with a stripped-down toolkit for symbolic graphic notation. The Paper.js front-end would allow users to easily create their own interactive tools, and either export a rendering of the score to SVG for printing with a program like Illustrator, or stream the score via odot/SWIG. There is some possibility that INScore's V8 27 integration might provide a suitable platform for these developments; this would also allow the editor to take advantage of INScore's MVC design and traditional notation tools.
  Other improvements might include a more intuitive system for defining meaning for symbols. In the process of sketching and developing a notation, it was time consuming to constantly keep objects nicely grouped and labeled using Illustrator's layer and grouping tools. This issue can be mitigated through the use of search algorithms to auto-detect symbol patterns (i.e. containing similar types of graphic objects, gestures, etc.), which would allow the artist to later apply semantic structuring rules to different members of these symbolic groupings.

6. CONCLUSION

The authoring of data in computer music systems is predominantly done through graphical representations of univariate functions, whereas symbolic notation systems like music notation are aggregate and contextual. A symbol in a notation system is given meaning through the interpretation of a human or computerized intelligence based on contextual understanding; for example, the nature of a staccato string articulation is different for different dynamic ranges. Complex rendering systems incorporating digital signal processing and/or other electronic media often have a large number of parameters that artists wish to control expressively. Due to the affordances of the programming environments in which these pieces are created, there is typically a focus on control of many single parameters. However, the symbolic representation of information such as spatial location becomes fragmented in these systems, forcing a point in space to be represented with three separate coordinates, which in many ways obscures its perceptual simplicity.
  The SVG format provides a useful method for defining meanings of symbols leveraging Illustrator's grouping and layering tools, while the graphic editing environment provided by graphic design programs like Illustrator provides a flexible vector graphic drafting environment for symbolic experimentation. Since Illustrator was designed without musical applications in mind, there are no pre-conceived playback limitations based on the application developer's idea of what "music" is, or how graphic symbols on a page should be organized. This lack of meaning leaves room for the user to sketch and experiment, while also requiring extra effort to create meaning through an interpretive algorithm if the score is meant to be performed by the computer. Transcoding the SVG format into OSC facilitates the interpretation of notation through the use of the odot expression language in the Max media programming environment, providing digital artists a mechanism to perform graphic symbolic notation with any electronic media accessible with Max, Pd, or any other application that can interpret OSC.
  Preliminary work on developing an interpretation and performance system for notation stored in SVG format has proven feasible; however, there is still significant work needed to bring the system to a point where it would be competitive with existing rendering systems that are specifically designed for a given medium. On the other hand, the openness of the SVG format, combined with its compatibility with OSC, points towards a myriad of new ways of expressively controlling new media formats with symbolic notation. Looking towards the future, the above plans for a new symbolic graphic notation editor discussed in section 5 seem to be a promising direction for the creation of notation software capable of being used to render new media forms that have proven difficult to notate (such as spatial audio), as well as those that have yet to be thought of.

23 http://www.adobe.com/devnet/cs-extension-builder/articles/extend-adobe-cc-2014-apps.html
24 http://www.swig.org
25 http://paperjs.org
26 http://scriptographer.org
27 https://code.google.com/p/v8
Acknowledgments

I would like to thank John MacCallum and Adrian Freed for their valuable feedback in developing this study, and in developing the odot system without which this work would not exist. I would also like to thank Olivier Warusfel, Markus Noisternig, and Thibaut Carpentier for their mentorship during my spatial composition residency at Ircam where many of these ideas were developed, and Jean Bresson for his continued feedback and great work in the field of musical representation.

7. REFERENCES

[1] K. Stone, "Problems and methods of notation," Perspectives of New Music, vol. 1, no. 2, pp. 9–31, 1963.

[2] P. Manning, "The oramics machine: From vision to reality," Organised Sound, vol. 17, no. 2, pp. 137–147, August 2012.

[3] L. Pchelkina and A. Smirnov, "1917-1939. Son Z / Sound in Z," PALAIS / Palais de Tokyo Magazine, Paris, no. 7, pp. 66–77, 2008.

[4] B. Farnell, "Movement notation systems," in The World's Writing Systems. New York, NY: Oxford University Press, 1996, pp. 855–879.

[12] M. Tohidi, W. Buxton, R. Baecker, and A. Sellen, "User sketches: A quick, inexpensive, and effective way to elicit more reflective user feedback," in Proceedings of NordiCHI, Oslo, Norway, 2006.

[13] D. Wessel, "An enactive approach to computer music performance," in Le Feedback dans la Création Musicale, Y. Orlarey, Ed. Lyon, France: Studio Gramme, 2006, pp. 93–98.

[14] A. Freed and A. Schmeder, "Features and future of Open Sound Control version 1.1 for NIME," in Proceedings of NIME, 2009.

[15] A. Freed, J. MacCallum, and A. Schmeder, "Composability for musical gesture signal processing using new OSC-based object and functional programming extensions to Max/MSP," in Proceedings of NIME, Oslo, Norway, 2011.

[16] J. MacCallum, A. Freed, and D. Wessel, "Agile interface development using OSC expressions and process migration," in Proceedings of NIME, Daejeon, Korea, 2013.
   ABJAD: AN OPEN-SOURCE SOFTWARE SYSTEM FOR FORMALIZED
                       SCORE CONTROL
ABSTRACT

The Abjad API for Formalized Score Control extends the Python programming language with an open-source, object-oriented model of common-practice music notation that enables composers to build scores through the aggregation of elemental notation objects. A summary of widely used notation systems' intended uses motivates a discussion of system design priorities via examples of system use.

1. INTRODUCTION

Abjad 1 is an open-source software system designed to help composers build scores in an iterative and incremental way. Abjad is implemented in the Python 2 programming language as an object-oriented collection of packages, classes and functions. Composers can visualize their work as publication-quality notation at all stages of the compositional process using Abjad's interface to the LilyPond 3 music notation package. The first versions of Abjad were implemented in 1997 and the project website is now visited thousands of times each month. This paper details some of the most important principles guiding the development of Abjad and illustrates these with examples of the system in use. The priorities outlined here arise in answer to domain-specific questions of music modeling (What are the fundamental elements of music notation? Which elements of music notation should be modeled hierarchically?) as well as in consideration of the ways in which best practices taken from software engineering can apply to the development of a music software system (How can programming concepts like iteration, aggregation and encapsulation help composers as they work?). A background taxonomy motivates a discussion of design priorities via examples of system use.

2. A TAXONOMY

Many software systems implement models of music but few of these implement a model of notation. Many music software systems model higher-level musical entities apparent in the acts of listening and analysis while omitting any model of the symbols of music notation. Researchers and musical artists have modeled many such extrasymbolic musical entities, such as large-scale form and transition [1–5], texture [6], contrapuntal relationships [7–13], harmonic tension and resolution [14–16], melody [17, 18], meter [19], rhythm [20–22], timbre [23–25], temperament [26, 27] and ornamentation [28, 29]. This work overlaps fruitfully with analysis tasks because models of listening and cognition can enable novel methods of generating high-level musical structures and transformations, like dramatic direction, tension, and transition between sections [30].
  Software production exists as an organizationally designed feedback loop between production values and implementation [31]. It is possible to understand a system by understanding the purpose for which it was initially designed. This purpose can be termed a software system's generative task. In the classification of systems created for use by artists, this priority yields a dilemma instantly, as analyses that explain a system's affordances with reference to intended purpose must contend with the creative use of technology by artists: a system's intended uses might have little or nothing in common with the way in which the artist finally uses the technology. For this reason, the notion of generative task is best understood as an explanation for a system's affordances, with the caveat that a user can nonetheless work against those affordances to use the system in novel ways.
  While composers working traditionally may allow intuition to substitute for formally defined principles, a computer demands that the composer think formally about music [32]. Keeping generative task in mind as an analytical framework, it is broadly useful to bifurcate a notation system's development into the modeling of composition, on the one hand, and the modeling of musical notation, on the other. All systems model both, to greater or lesser degrees, often engaging in the ambiguous or implicit modeling of composition while focusing more ostensibly on a model of notation, or focusing on the abstract modeling of composition without a considered link to a model of notation. Generative task explains a given system's balance between computational models of composition and notation by assuming a link between intended use and system development.

1 http://www.projectabjad.org
2 http://www.python.org
3 http://www.lilypond.org

Copyright: ©2015 Trevor Bača et al. This is an open-access article distributed under the terms of the Creative Commons Attribution 3.0 Unported License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Many notation systems — such as Finale, Sibelius, SCORE [33], Igor, Berlioz, Lilypond [34], GUIDO [35], NoteAbility [36], FOMUS [37, 38] and Nightingale — exist to help people engrave and format music documents; because these systems provide functions that operate on notational elements (i.e., transposition, spacing and playback), hidden models of common-practice music notation must underlie all of these systems, and each system's interface constrains and directs the ways in which users interact with this underlying model of notation. These systems enable users to engrave and format music without exposing any particular underlying model of composition, and without requiring, or even inviting, the user to computationally model composition. Such systems might go so far as to enable scripting, as in the case of Sibelius's ManuScript [39] scripting language or Lilypond's embedded Scheme code; although these systems enable the automation of notational elements, it remains difficult to model compositional processes and relationships.
  Other systems provide environments specifically for the modeling of higher-level processes and relationships. OpenMusic [40], PWGL [41] and BACH [42] supply an interface to a model of common-practice notation, as well as a set of non-common-practice visual interfaces that enables the user to model composition, in the context of a stand-alone application and with the aid of the above notation editors for final engraving and layout via intermediate file formats. Similarly purposed systems extend text-based programming languages rather than existing as stand-alone applications, such as HMSL's extension of Forth [43], JMSL's extension of Java [44] and Common Music's extension of Lisp [45]. Other composition modeling systems, such as athenaCL [46] and PILE/AC Toolbox [47], provide a visual interface for the creation of compositional structures without providing a model of common-practice notation.
  Some composers make scores with software systems that provide neither a model of notation nor a model of composition. Graphic layout programs, such as AutoCAD and

interchange format for over 160 applications and maintains a relatively application-agnostic status, as it was designed with the generative task of acting as an interchange format between variously-tasked systems [50].
  An attempt to survey more comprehensively the history of object-oriented notation systems for composition, in the context of the broader history of object-oriented programming, lies beyond the scope of this paper but has recently been undertaken elsewhere [51].

3. ABJAD BASICS

Abjad is not a stand-alone application. Nor is Abjad a programming language. Abjad instead adds a computational model of music notation to Python, one of the most widely used programming languages currently available. Abjad's design as a standard extension to Python makes hundreds of print and Web programming resources relevant to composers and further helps to make the global communities of software developers and composers available to each other. 4 5 Composers work with Abjad exactly the same as with any other Python package. In the most common case this means opening a file, writing code and saving the file:

from abjad import *

def make_nested_tuplet(
    tuplet_duration,
    outer_tuplet_proportions,
    inner_tuplet_subdivision_count,
    ):
    outer_tuplet = Tuplet.from_duration_and_ratio(
        tuplet_duration, outer_tuplet_proportions)
    inner_tuplet_proportions = \
        inner_tuplet_subdivision_count * [1]
    last_leaf = outer_tuplet.select_leaves()[-1]
    inspector = inspect_(last_leaf)
    right_logical_tie = inspector.get_logical_tie()
    right_logical_tie.to_tuplet(inner_tuplet_proportions)
    return outer_tuplet

The classes, functions and other identifiers defined in the file can then be used in other Python files or in an interactive session:

>>> rhythmic_staff = Staff(context_name='RhythmicStaff')
>>> tuplet = make_nested_tuplet((7, 8), (3, -1, 2), 3)
>>> rhythmic_staff.append(tuplet)
>>> show(rhythmic_staff)

[Rendered notation: a nested tuplet marked 12:7, containing an inner tuplet marked 3:2.]
a PDF of the result. But note that composers work with Abjad primarily by typing notationally-enabled Python code into a collection of interrelated files and managing those files as a project grows to encompass the composition of an entire score.

4. THE ABJAD OBJECT MODEL

Abjad models musical notation with components, spanners and indicators. Every notational element in Abjad belongs to one of these three families. Abjad models notes, rests and chords as classes that can be added into the container-like elements of music notation, such as tuplets, measures, voices, staves and complete scores. Spanners model notational constructs that cross different levels of hierarchy in the score tree, such as beams, slurs and glissandi. Indicators model objects like articulations, dynamics and time signatures that attach to a single component. Composers arrange components hierarchically into a score tree with spanners and indicators attached to components in the tree. 6

5. BOTTOM-UP CONSTRUCTION

Abjad lets composers build scores from the bottom up. When working bottom-up, composers create individual notes, rests and chords to be grouped into tuplets, measures or voices that may then be included in even higher-level containers, such as staves and scores. Abjad affords this style of component aggregation via a container interface which derives from Python's mutable sequence protocol. Python's mutable sequence protocol specifies an interface to list-like objects. Abjad's container interface implements a collection of methods which append, extend or insert into Abjad containers:

>>> outer_tuplet_one = Tuplet((2, 3), "d''16 f'8.")
>>> inner_tuplet = Tuplet((4, 5), "cs''16 e'16 d'2")
>>> outer_tuplet_one.append(inner_tuplet)
>>> outer_tuplet_two = Tuplet((4, 5))
>>> outer_tuplet_two.extend("d'8 r16 c'16 bf'16")
>>> tuplets = [outer_tuplet_one, outer_tuplet_two]
>>> upper_staff = Staff(tuplets, name='Upper Staff')
>>> note_one = Note(10, (3, 16))
>>> upper_staff.append(note_one)
>>> note_two = Note(NamedPitch("fs'"), Duration(1, 16))
>>> upper_staff.append(note_two)
>>> lower_staff = Staff(name='Lower Staff')
>>> lower_staff.extend("c8 r8 b8 r8 gf8 r4 cs8")
>>> staff_group = StaffGroup()
>>> staff_group.extend([upper_staff, lower_staff])
>>> score = Score([staff_group])
>>> show(score)

[Rendered notation: a two-staff score with nested tuplets marked 3:2 and 5:4.]

  Notes and chords may be initialized with pitches named according to either American or European conventions. Notes and chords may also be initialized with the pitch numbers of American pitch-class theory or from combinations of Abjad pitch and duration objects. Unlike many notation packages, Abjad does not require composers to structure music into measures. All Abjad containers can hold notes, rests and chords directly.
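By way of illustration, a single D5 sixteenth note can be written in any of these equivalent forms (a sketch patterned on the paper's own examples of the 2015-era Abjad API; exact argument forms may differ across Abjad versions):

>>> # Three equivalent initializations of the same sixteenth note:
>>> # a LilyPond-style string, an American pitch number, and explicit
>>> # pitch and duration objects.
>>> note_a = Note("d''16")
>>> note_b = Note(14, Duration(1, 16))
>>> note_c = Note(NamedPitch("d''"), Duration(1, 16))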
                                                                   nents by hand in the way shown above resembles notating
a collection of methods which append, extend or insert into
                                                                   by hand and composers may choose to work bottom-up
Abjad containers:
                                                                   when doing the equivalent of sketching in computer code:
>>> outer_tuplet_one = Tuplet((2, 3), "d''16 f'8.")
>>> inner_tuplet = Tuplet((4, 5), "cs''16 e'16 d'2")
>>> outer_tuplet_one.append(inner_tuplet)
>>> outer_tuplet_two = Tuplet((4, 5))
>>> outer_tuplet_two.extend("d'8 r16 c'16 bf'16")
>>> tuplets = [outer_tuplet_one, outer_tuplet_two]
>>> upper_staff = Staff(tuplets, name='Upper Staff')
>>> note_one = Note(10, (3, 16))
>>> upper_staff.append(note_one)
>>> note_two = Note(NamedPitch("fs'"), Duration(1, 16))
>>> upper_staff.append(note_two)
>>> lower_staff = Staff(name='Lower Staff')
>>> lower_staff.extend("c8 r8 b8 r8 gf8 r4 cs8")
>>> staff_group = StaffGroup()
>>> staff_group.extend([upper_staff, lower_staff])
>>> score = Score([staff_group])
>>> show(score)

[Notation output: a two-staff score containing nested tuplets in the upper staff.]
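Because the container interface derives from Python's mutable sequence protocol, a container needs to supply only a handful of primitive methods to obtain the full list-like interface. The following minimal sketch (generic Python, not Abjad's actual implementation; the class name is hypothetical) shows how deriving from collections.abc.MutableSequence yields append() and extend() for free once indexing, deletion, length and insert() are defined:

    from collections.abc import MutableSequence

    class SimpleContainer(MutableSequence):
        """Minimal sketch of a list-like score container."""

        def __init__(self, components=None):
            self._components = list(components or [])

        def __getitem__(self, i):
            return self._components[i]

        def __setitem__(self, i, component):
            self._components[i] = component

        def __delitem__(self, i):
            del self._components[i]

        def __len__(self):
            return len(self._components)

        def insert(self, i, component):
            self._components.insert(i, component)

    # append(), extend(), remove(), pop() and the other list-like
    # methods are supplied automatically by MutableSequence.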
Notes and chords may be initialized with pitches named according to either American or European conventions. Spanners such as ties and slurs, and indicators such as articulations and clefs, may then be attached to selections of leaves sliced out of the score: 7

>>> upper_leaves = upper_staff.select_leaves()
>>> lower_leaves = lower_staff.select_leaves()
>>> attach(Tie(), upper_leaves[4:6])
>>> attach(Tie(), upper_leaves[-3:-1])
>>> attach(Slur(), upper_leaves[:2])
>>> attach(Slur(), upper_leaves[2:6])
>>> attach(Slur(), upper_leaves[7:])
>>> attach(Articulation('accent'), upper_leaves[0])
>>> attach(Articulation('accent'), upper_leaves[2])
>>> attach(Articulation('accent'), upper_leaves[7])
>>> attach(Clef('bass'), lower_leaves[0])
>>> show(score)

[Notation output: the same score with ties, slurs, accents and a bass clef added.]

When does it make sense for composers to work with Abjad in the bottom-up way outlined here? Instantiating components by hand in the way shown above resembles notating by hand, and composers may choose to work bottom-up when doing the equivalent of sketching in computer code: when making the first versions of a figure or gesture, when trying out combinations of small bits of notation, or when inserting one or two items at a time into a larger structure. For some composers this may be a regular or even predominant way of working. Other composers may notice patterns in their own compositional process when they work bottom-up and may find ways to formalize these patterns into classes or functions that generalize their work; the next section describes some ways composers do this.
6 Abjad chords aggregate note-heads instead of notes. This corrects a modeling problem sometimes present in other music software systems: if chords aggregate multiple notes, and every note has a stem, then how is it that chords avoid multiple stems? Abjad chords implement the container interface described above to add and remove note-heads to and from chords.

7 Python allows indexing into sequences by both positive and negative indices. Positive indices count from the beginning of the sequence, starting at 0, while negative indices count from the end of the sequence, with -1 being the last item in the sequence and -2 the second-to-last. Subsegments of a sequence may be retrieved by slicing with an optional start and an optional stop index. The slice indicated by [1:-1] would retrieve all of the items in a sequence starting from the second and going up until, but not including, the last. The slice indicated by [:3], which omits a start index, retrieves all items from the sequence up until, but not including, the fourth.
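As a concrete illustration of the slice notation used in the examples above (ordinary Python, independent of Abjad):

>>> leaves = ['c4', 'd4', 'e4', 'f4', 'g4']
>>> leaves[1:-1]
['d4', 'e4', 'f4']
>>> leaves[:3]
['c4', 'd4', 'e4']
>>> leaves[-3:-1]
['e4', 'f4']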
6. TOP-DOWN CONSTRUCTION

What are the objects of music composition? For most composers, individual notes, rests and chords constitute only the necessary means to achieve some larger, musically interesting result. For this reason, a model of composition needs to describe groups of symbols on the page: notes taken in sequence to constitute a figure, gesture or melody; chords taken in sequence as a progression; attack points taken in sequence as a rhythm.
  As an example, consider the rhythmmakertools package included with Abjad. The classes provided in this package are factory classes which, once configured, can be called like functions to inscribe rhythms into a series of beats or other time divisions (a sketch of this factory pattern appears after the examples below). The example below integrates configurable patterns of durations, tupletting and silences:
>>> burnish_specifier = rhythmmakertools.BurnishSpecifier(
...     left_classes=(Rest, Note),
...     left_counts=(1,),
...     )
>>> talea = rhythmmakertools.Talea(
...     counts=(1, 2, 3),
...     denominator=16,
...     )
>>> tie_specifier = rhythmmakertools.TieSpecifier(
...     tie_across_divisions=True,
...     )
>>> rhythm_maker = rhythmmakertools.TaleaRhythmMaker(
...     burnish_specifier=burnish_specifier,
...     extra_counts_per_division=(0, 1, 1),
...     talea=talea,
...     tie_specifier=tie_specifier,
...     )
>>> divisions = [(3, 8), (5, 16), (1, 4), (3, 16)]
>>> show(rhythm_maker, divisions=divisions)
[Notation output: measures of 3/8, 5/16, 1/4 and 3/16 inscribed with the configured rhythm.]

Once instantiated, factory classes like this can be used over and over again with different input:

>>> rhythmic_score = Score()
>>> for i in range(8):
...     selections = rhythm_maker(divisions, seeds=i)
...     staff = Staff()
...     measure = Measure((9, 8), selections)
...     staff.append(measure)
...     rhythmic_score.append(staff)
...     divisions = sequencetools.rotate_sequence(divisions, 1)
...
>>> show(rhythmic_score)

[Notation output: an eight-staff score of 9/8 measures, each staff carrying a rotated variant of the same material.]
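The "configure once, call like a function" behavior of these factory classes rests on Python's __call__ protocol. The following minimal sketch shows the shape of the pattern (generic Python; the class and its attributes are hypothetical, and the real rhythm-makers are far more elaborate):

    from itertools import cycle

    class CycleRhythmFactory:
        """Sketch of a configurable, callable rhythm factory."""

        def __init__(self, counts, denominator):
            # Configuration happens once, at initialization.
            self.counts = counts
            self.denominator = denominator

        def __call__(self, divisions, seeds=0):
            # A seed rotates the configured counts so that repeated
            # calls can vary their output; the rotated counts are
            # then cycled through, one per division.
            n = seeds % len(self.counts)
            rotated = self.counts[n:] + self.counts[:n]
            counts = cycle(rotated)
            return [(next(counts), self.denominator) for _ in divisions]

    factory = CycleRhythmFactory(counts=(1, 2, 3), denominator=16)
    factory([(3, 8), (5, 16)], seeds=1)  # called like a function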
7. SELECTING OBJECTS IN THE SCORE

Abjad allows composers to select and operate on collections of objects in a score. Composers can select objects in several ways: by name, numeric indices or iteration. A single operation, such as transposing pitches or attaching articulations, can then be mapped onto the entirety of a selection.
  Consider the two-staff score created earlier. Because both staves were given explicit names, the upper staff can be selected by name:

>>> upper_staff = score['Upper Staff']
>>> show(upper_staff)

[Notation output: the upper staff alone.]

Using numeric indices, the lower staff can be selected by indexing the second child of the first child of the score:

>>> lower_staff = score[0][1]
>>> show(lower_staff)

[Notation output: the lower staff alone.]

The top-level iterate() function exposes Abjad's score iteration interface. This interface provides a collection of methods for iterating the components in a score in different ways. For example, all notes can be selected from a single staff:

>>> for note in iterate(lower_staff).by_class(Note):
...     attach(Articulation('staccato'), note)
...
>>> show(score)

[Notation output: the full score, with staccati attached to every note of the lower staff.]
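Selection by class during iteration amounts to a filtered depth-first traversal of the score tree. A minimal sketch of the idea (generic Python, not Abjad's actual implementation, whose interface is richer):

    def iterate_depth_first(component):
        """Yield a component, then recurse into its children."""
        yield component
        for child in getattr(component, 'components', []):
            yield from iterate_depth_first(child)

    def iterate_by_class(component, prototype):
        """Yield only those components that match a class."""
        for each in iterate_depth_first(component):
            if isinstance(each, prototype):
                yield each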
Composers can also iterate the logical ties in a score: notes or chords joined together by consecutive ties. The 'logical' qualifier points to the fact that Abjad considers untied notes and untied chords as logical ties of length 1, which makes it possible to select untied notes and chords together with tied notes and chords in a single method call:

>>> for logical_tie in iterate(score).by_logical_tie():
...     if 1 < len(logical_tie):
...         attach(Fermata(), logical_tie.tail)
...         for note in logical_tie:
...             override(note).note_head.style = 'cross'
...
>>> show(score)

[Notation output: the score, with cross-style note-heads and fermatas marking the logical ties longer than a single leaf.]
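The grouping behind logical ties can be pictured as follows (a schematic sketch in plain Python, where each leaf is a pair of a pitch and a flag saying whether the leaf is tied to its successor):

    def group_logical_ties(leaves):
        """Group consecutive tied leaves; untied leaves
        form logical ties of length 1."""
        groups, current = [], []
        for pitch, tied_to_next in leaves:
            current.append(pitch)
            if not tied_to_next:
                groups.append(current)
                current = []
        if current:
            groups.append(current)
        return groups

    group_logical_ties([("c'4", True), ("c'4", False), ("d'4", False)])
    # --> [["c'4", "c'4"], ["d'4"]]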
8. PROJECT TESTING AND MAINTENANCE

Abjad has benefited enormously from programming best practices developed by the open-source community. As described previously, the extension of an existing language informs the project as a first principle. The following other development practices from the open-source community have also positively impacted the project and might be helpful in the development of other music software systems.
  The literature investigated in preparing this report remains overwhelmingly silent on questions of software testing. None of the sources cited in this article reference software test methodologies. The same appears to be true for the larger list of sources included in [51]. 8 Why should this be the case? One possibility is that authors of music software systems have, in fact, availed themselves of important improvements in software test methods developed over the previous decades but have, for whatever reasons, remained quiet on the matter in the publication record. Perhaps the culture of software best practices now widely followed in the open-source community simply has not yet arrived in the field of music software systems development (and especially in the development of systems designed for non-commercial applications).
  The use of automated regression testing in Abjad's development makes apparent the way in which tests encourage efficient development and robust project continuance. Abjad comprises an automated battery of 9,119 unit tests and 8,528 documentation tests. Unit tests are executed by pytest. 9 Documentation tests are executed by the doctest module included in Python's standard library. Parameterized tests ensure that different classes implement similar behaviors in a consistent way. Developers run the entire battery of tests at the start of every development session. No new features are accepted as part of the Abjad codebase without tests authored to document changes to the system. Continuous integration testing is handled by Travis CI 10 to ensure that all tests pass after every commit, from every core developer and newcomer to the project alike.
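The two kinds of tests can be illustrated as follows (a generic sketch; the function under test is hypothetical and not drawn from Abjad's codebase). Doctests live in docstrings and double as documentation, while pytest parameterization runs one test body over many cases:

    import pytest

    def duration_in_sixteenths(numerator, denominator):
        """Convert a duration pair to a count of sixteenths.

        >>> duration_in_sixteenths(3, 8)
        6
        >>> duration_in_sixteenths(5, 16)
        5
        """
        return numerator * 16 // denominator

    @pytest.mark.parametrize('pair, expected', [
        ((3, 8), 6),
        ((5, 16), 5),
        ((1, 4), 4),
    ])
    def test_duration_in_sixteenths(pair, expected):
        assert duration_in_sixteenths(*pair) == expected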
The presence of automated regression tests acts as an incentive to new contributors to the system (who can test whether proposed changes to the system work correctly with existing features) and greatly increases the rate at which experienced developers can refactor the system during new feature development. Abjad currently comprises about 178,000 lines of code. The Abjad repository, hosted on GitHub, 11 lists more than 8.7 million lines of code committed since the start of the project. This refactor ratio of about 50:1 means that each line of code in the Abjad codebase has been rewritten dozens of times. The freedom to refactor at this rate is possible only because of the approach to automated regression testing Abjad has borrowed from the larger open-source community.
  Testing also benefits project continuance when the original developers of a music software system can no longer develop the system. Automated regression tests help make possible a changing of the guard from one set of developers to another. Automated tests serve as a type of functional specification of how a software system should behave after revision. While automated tests alone will not ensure the continued development of any software system, adherence to the testing practices of the open-source community constitutes the most effective hedge available to music software developers against project abandonment in the medium and long term.

9. DISCUSSION & FUTURE WORK

The design and development priorities for Abjad outlined here derive from the fact that the developers of Abjad are all composers who use the system to make their own scores. Abjad is not implemented for the type of music information storage and retrieval functions that constitute an important part of musicology-oriented music software systems. Nor is Abjad designed for use in real-time contexts of performance or synthesis. Abjad is designed as a composers' toolkit for the formalized control of music notation and for modeling the musical ideas that composers use notation to explore and represent. For example, figure 1 shows a one-page excerpt from a score constructed entirely with tools extending Abjad and typeset with LilyPond. Although Abjad embeds well in other music software systems, future work planned for Abjad itself does not prioritize file format conversion, audio synthesis, real-time applications or graphic user interface integration. Future work will instead extend Abjad for object-oriented control over parts of the document preparation process required of complex scores with many parts. Future work will also extend and reinforce the inventory of factory classes and factory functions introduced in this report. We hope this will encourage composers working with Abjad to transition from working with lower-level symbols of music notation to modeling higher-level ideas native to their own language of composition.

8 AthenaCL [46] and Music21 [52] are important exceptions. Both projects are implemented in Python and both projects feature approaches to testing in line with those outlined here.
9 http://pytest.org
10 https://travis-ci.org
11 https://github.com/Abjad/abjad
Figure 1. Page 8 from Josiah Wolf Oberholtzer’s Invisible Cities (ii): Armilla for two violas (2015), created with tools
extending Abjad. Source for this score is available at https://github.com/josiah-wolf-oberholtzer/armilla.
Acknowledgements

Our sincere thanks go out to all of Abjad's users and developers for their comments and contributions to the code. We would also like to thank everyone behind the LilyPond and Python projects, as well as the wider open-source community, for fostering the tools that make Abjad possible.

10. REFERENCES

[1] L. Polansky, M. McKinney, and B. E.-A. M. Studio, "Morphological Mutation Functions," in Proceedings of the International Computer Music Conference, 1991, pp. 234–241.
[2] Y. Uno and R. Huebscher, "Temporal-Gestalt Segmentation-Extensions for Compound Monophonic and Simple Polyphonic Musical Contexts: Application to Works by Cage, Boulez, Babbitt, Xenakis and Ligeti," in Proceedings of the International Computer Music Conference, 1994, p. 7.
[3] C. Dobrian, "Algorithmic Generation of Temporal Forms: Hierarchical Organization of Stasis and Transition," in Proceedings of the International Computer Music Conference, 1995.
[4] S. Abrams, D. V. Oppenheim, D. Pazel, J. Wright et al., "Higher-level Composition Control in Music Sketcher: Modifiers and Smart Harmony," in Proceedings of the International Computer Music Conference, 1999.
[5] M.-J. Yoo and I.-K. Lee, "Musical Tension Curves and its Applications," in Proceedings of the International Computer Music Conference, 2006.
[6] S. Horenstein, "Understanding Supersaturation: A Musical Phenomenon Affecting Perceived Time," in Proceedings of the International Computer Music Conference, 2004.
[7] G. Boenn, M. Brain, M. De Vos et al., "Anton: Composing Logic and Logic Composing," in Logic Programming and Nonmonotonic Reasoning. Springer, 2009, pp. 542–547.
[8] M. E. Bell, "A MAX Counterpoint Generator for Simulating Stylistic Traits of Stravinsky, Bartok, and Other Composers," in Proceedings of the International Computer Music Conference, 1995.
[9] M. Farbood and B. Schoner, "Analysis and Synthesis of Palestrina-style Counterpoint using Markov Chains," in Proceedings of the International Computer Music Conference, 2001, pp. 471–474.
[10] D. Cope, "Computer Analysis and Computation Using Atonal Voice-Leading Techniques," Perspectives of New Music, vol. 40, no. 1, pp. 121–146, 2002. [Online]. Available: http://www.jstor.org/stable/833550
[11] M. Laurson and M. Kuuskankare, "Extensible Constraint Syntax Through Score Accessors," in Journées d'Informatique Musicale, 2005, pp. 27–32.
[12] L. Polansky, A. Barnett, and M. Winter, "A Few More Words About James Tenney: Dissonant Counterpoint and Statistical Feedback," Journal of Mathematics and Music, vol. 5, no. 2, pp. 63–82, 2011.
[13] K. Ebcioglu, "Computer Counterpoint," in Proceedings of the International Computer Music Conference, 1980.
[14] A. F. Melo and G. Wiggins, "A Connectionist Approach to Driving Chord Progressions Using Tension," in Proceedings of the AISB, vol. 3, no. 1988, 2003. [Online]. Available: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.115.9086&rep=rep1&type=pdf
[15] G. Wiggins, "Automated Generation of Musical Harmony: What's Missing?" in Proceedings of the International Joint Conference on Artificial Intelligence, 1999. [Online]. Available: http://www.doc.gold.ac.uk/~mas02gw/papers/IJCAI99b.pdf
[16] C. D. Foster, "A Consonance Dissonance Algorithm for Intervals," in Proceedings of the International Computer Music Conference, 1995.
[17] D. Hornel, "SYSTHEMA - Analysis and Automatic Synthesis of Classical Themes," in Proceedings of the International Computer Music Conference, 1993.
[18] M. Smith and S. Holland, "An AI Tool for Analysis and Generation of Melodies," in Proceedings of the International Computer Music Conference, 1992.
[19] M. Hamanaka, K. Hirata, and S. Tojo, "Automatic Generation of Metrical Structure Based on GTTM," in Proceedings of the International Computer Music Conference, 2005.
[20] P. Nauert, "Division- and Addition-based Models of Rhythm in a Computer-Assisted Composition System," Computer Music Journal, vol. 31, no. 4, pp. 59–70, Dec. 2007. [Online]. Available: http://www.mitpressjournals.org/doi/abs/10.1162/comj.2007.31.4.59
[21] B. Degazio, "A Computer-based Editor for Lerdahl and Jackendoff's Rhythmic Structures," in Proceedings of the International Computer Music Conference, 1996.
[22] N. Collins, "A Microtonal Tempo Canon Generator After Nancarrow and Jaffe," in Proceedings of the International Computer Music Conference, 2003.
[23] I. Xenakis, "More Thorough Stochastic Music," in Proceedings of the International Computer Music Conference, 1991.
[24] D. P. Creasey, D. M. Howard, and A. M. Tyrrell, "The Timbral Object - An Alternative Route to the Control of Timbre Space," in Proceedings of the International Computer Music Conference, 1996.
[25] N. Osaka, "Toward Construction of a Timbre Theory for Music Composition," in Proceedings of the International Computer Music Conference, 2004.
[26] J. C. Seymour, "Computer-assisted Composition in Equal Tunings: Tonal Cognition and the Thirteen Tone March," in Proceedings of the International Computer Music Conference, 2007.
[27] A. Gräf, "On Musical Scale Rationalization," in Proceedings of the International Computer Music Conference, 2006.
[28] C. Ariza, "Ornament as Data Structure: An Algorithmic Model Based on Micro-Rhythms of Csángó Laments and Funeral Music of the Csángó," in Proceedings of the International Computer Music Conference, 2003.
[29] W. Chico-Töpfer, "AVA: An Experimental, Grammar/Case-based Composition System to Variate Music Automatically Through the Generation of Scheme Series," in Proceedings of the International Computer Music Conference, 1998.
[30] N. Collins, "Musical Form and Algorithmic Composition," Contemporary Music Review, vol. 28, no. 1, pp. 103–114, Feb. 2009.
[31] J.-C. Derniame, B. A. Kaba, and D. Wastell, Software Process: Principles, Methodology, and Technology. Springer, 1999.
[32] I. Xenakis, Formalized Music: Thought and Mathematics in Composition. Pendragon Press, 1992.
[33] L. Smith, "SCORE - A Musician's Approach to Computer Music," Journal of the Audio Engineering Society, vol. 20, no. 1, pp. 7–14, 1972.
[34] H.-W. Nienhuys and J. Nieuwenhuizen, "LilyPond, a System for Automated Music Engraving," in Proceedings of the XIV Colloquium on Musical Informatics (XIV CIM 2003), 2003, pp. 167–172.
[35] H. H. Hoos, K. Hamel, K. Renz, and J. Kilian, "The GUIDO Notation Format - A Novel Approach for Adequately Representing Score-level Music," in Proceedings of the International Computer Music Conference, 1998.
[36] K. Hamel, "NoteAbility Reference Manual," 1997.
[37] D. Psenicka, "FOMUS, a Music Notation Software Package for Computer Music Composers," in Proceedings of the International Computer Music Conference, 2006, pp. 75–78.
[38] ——, "Automated Score Generation with FOMUS," in Proceedings of the International Computer Music Conference, 2009, pp. 69–72.
[39] AVID, Plugins for Sibelius. [Online]. Available: http://www.sibelius.com/download/plugins/index.html?help=write
[40] G. Assayag, C. Rueda, M. Laurson, C. Agon, and O. Delerue, "Computer-Assisted Composition at IRCAM: From PatchWork to OpenMusic," Computer Music Journal, vol. 23, no. 3, pp. 59–72, 1999. [Online]. Available: http://www.jstor.org/stable/3681240
[41] M. Laurson, M. Kuuskankare, and V. Norilo, "An Overview of PWGL, a Visual Programming Environment for Music," Computer Music Journal, vol. 33, no. 1, pp. 19–31, 2009.
[42] A. Agostini and D. Ghisi, "Real-time Computer-aided Composition with BACH," Contemporary Music Review, vol. 32, no. 1, pp. 41–48, 2013.
[43] L. Polansky, "HMSL (Hierarchical Music Specification Language): A Theoretical Overview," Perspectives of New Music, vol. 28, no. 2, 1990.
[44] N. Didkovsky and P. Burk, "Java Music Specification Language, an Introduction and Overview," in Proceedings of the International Computer Music Conference, 2001, pp. 123–126.
[45] H. Taube, "Common Music: A Music Composition Language in Common Lisp and CLOS," Computer Music Journal, pp. 21–32, 1991.
[46] C. Ariza, "An Open Design for Computer-aided Algorithmic Composition: athenaCL," Ph.D. dissertation, New York University, 2005. [Online]. Available: http://books.google.com/books?hl=en&lr=&id=XukW-mq76mcC&oi=fnd&pg=PR3&dq=An+Open+Design+for+Computer-Aided+Algorithmic+Composition:+athenacl&ots=bHedXym8ZP&sig=9i2RQINqIVr2Y7sjxeD9e74myxA
[47] P. Berg, "PILE - A Language for Sound Synthesis," Computer Music Journal, vol. 3, no. 1, pp. 30–41, 1979. [Online]. Available: http://www.jstor.org/stable/3679754
[48] E. Selfridge-Field, Beyond MIDI: The Handbook of Musical Codes. The MIT Press, 1997.
[49] NIFF Consortium et al., "NIFF 6a: Notation Interchange File Format," NIFF Consortium, Tech. Rep., 1995.
[50] M. Good, "MusicXML for Notation and Analysis," in The Virtual Score: Representation, Retrieval, Restoration, ser. Computing in Musicology, W. B. Hewlett and E. Selfridge-Field, Eds. MIT Press, 2001, no. 12, pp. 113–124.
[51] J. R. Trevino, "Compositional and Analytic Applications of Automated Music Notation via Object-oriented Programming," Ph.D. dissertation, University of California, San Diego, 2013.
[52] C. Ariza and M. Cuthbert, "Modeling Beats, Accents, Beams, and Time Signatures Hierarchically with Music21 Meter Objects," in Proceedings of the International Computer Music Conference, 2010. [Online]. Available: http://web.mit.edu/music21/papers/2010MeterObjects.pdf
THE NOTATION OF DYNAMIC LEVELS IN THE PERFORMANCE OF ELECTRONIC MUSIC

ABSTRACT

“Sound diffusion” (or “sound projection”), that is, “the projection and the spreading of sound in an acoustic space for a group of listeners” [1], of works for solo electronics or for acoustic instruments and electronics (so-called “mixed pieces”) has always raised the issue of notating the levels to be reproduced during a concert, or the correct balance between the electronics and the instruments.

Although, in the last decades, some attempts were made by a few composers or computer-music designers, mostly in the form of scores, none of these managed to establish a common practice. In addition, little theoretical work has been done so far to address the performative aspects of a piece, that is, to provide just the useful information to the person in charge of the sound diffusion.

Through the discussion of three historical examples and the analysis of two experiences we developed, we will try to identify some possibly general solutions that could be adopted independently of the aesthetic or technological choices of a given piece.

Copyright: © 2015 Carlo Laurenzi et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License 3.0 Unported, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

1. PRELIMINARY CONSIDERATIONS

The notation of electronic music has generated only a few, often partial, essays. Most of the literature is either quite theoretical [2], or delves into the automated translation of electronic sounds into a sort of graphical score, as in [3]. These experiments were mainly aimed at providing ways to analyse purely electronic pieces more deeply than by simply listening to them, at accounting for the compositional process, or at digitally preserving and archiving cultural assets [4].

To our knowledge, little theoretical work has been done to tackle the more general issue of how to notate dynamic levels on a score that is to be read by the computer music performer (CMP) who will perform the electronics during a concert. The CMP does not need to be the composer or the first performer of the piece.

Although this task could be programmed on a computer and automated during the concert, a much better result can be achieved by doing it by ear. The listening and musical skills of a human being are, in fact, still much superior to what a machine can realize. The sound diffusion can be adapted to the acoustics of the hall, the properties of the loudspeakers, the whole audio system, the relationship between these and the acoustic image of the instruments on stage, whether they are amplified or not, and, finally, to the emotional reaction of the audience.

As a consequence, most of the time the dynamic levels are controlled by ear (and by hand) by the CMP or the composer. Often they are only roughly sketched in the score. While a faithful recording will certainly help as a reference, the information it carries is usually insufficient, especially in the case of particular spatial configurations that cannot be reproduced by a stereo recording.

Therefore, the most effective solution is to notate all the information about the sound diffusion directly on the score that will be used during the performance.

To delimit our scope, we will concentrate on the notation of dynamic levels and will not tackle the issue of notating other parameters used for real-time sound processing, such as, for instance, the transposition factor of a harmonizer.

1.1 Levels vs. loudness vs. musical dynamics

Objectively, levels are normally expressed in decibels, a logarithmic unit related to the ratio between a given sound pressure and a reference sound pressure (usually either the threshold of audibility or the maximum available value in a given system)1.

1 See http://en.wikipedia.org/wiki/Decibel (accessed 3/10/2015)
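By way of illustration, the decibel arithmetic used throughout this paper reduces to a pair of logarithmic conversions. The following minimal sketch (ours, not part of the paper's tooling; the full-scale reference is an assumption) converts between linear amplitude factors and dB:

import math

def amplitude_to_db(amplitude, reference=1.0):
    # Level in dB of an amplitude relative to a reference,
    # e.g. digital full scale (dBFS).
    return 20.0 * math.log10(amplitude / reference)

def db_to_amplitude(db, reference=1.0):
    # Inverse mapping: a gain in dB back to a linear factor.
    return reference * 10.0 ** (db / 20.0)

print(amplitude_to_db(0.5))    # halving the amplitude is about -6.02 dB
print(db_to_amplitude(-12.0))  # -12 dB is roughly a quarter of the amplitude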
However, there are other ways to express them: from the point of view of perception, dynamic levels are called “loudness” and are measured in phons (a unit that takes into account the psycho-acoustic effect of the equal-loudness curves, ISO 226:2003)2; from a musical point of view, levels are called “dynamics” and use symbols such as ff, mf, pp.

2 See http://en.wikipedia.org/wiki/Phon (accessed 3/10/2015)

Three important factors need to be taken into account: first, the same musical dynamics played by different instruments, or in different ranges of the same instrument, might yield different objective or perceptual levels; second, choices of interpretation play an important role and produce different absolute levels for the same musical dynamics, as pointed out in [5]3; third, the perception of an acoustic instrument’s crescendo is always associated with the production of a richer and broader spectrum, that is, with a shift of the spectral “centre of gravity” toward a higher value. These spectral aspects differ from instrument to instrument and can be easily demonstrated by recording three sound files at three different dynamics (say, pp, mf, ff), cleaning them of background noise and finally normalizing them. Even though they then have the same maximum amplitude, their dynamics can still be easily and correctly identified.

3 “The absolute meanings of dynamic markings change, depending on the intended (score-defined) and projected (actual) dynamic levels of the surrounding context”, [5] abstract.

Hence, simply raising a fader will not be sufficient to convey a real feeling of crescendo, but rather that of a sound getting closer. When notating levels in a performance score, which unit should be used: dB, loudness or musical dynamics?

These strategies clearly suggest that a compromise between space or information economy and score readability needs to be found. Their usage also depends on the nature of the required movements: simply raising a fader to a given static level does not require the same precision as a jagged change over a longer period of time.
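The spectral “centre of gravity” mentioned above is what signal processing calls the spectral centroid. A minimal numpy sketch (ours; window choice and frame size are arbitrary) shows how the claim about normalized pp/mf/ff recordings can be checked:

import numpy as np

def spectral_centroid(frame, samplerate):
    # Amplitude-weighted mean frequency of one analysis frame.
    spectrum = np.abs(np.fft.rfft(frame * np.hanning(len(frame))))
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / samplerate)
    total = spectrum.sum()
    return float((freqs * spectrum).sum() / total) if total > 0 else 0.0

# Even after peak-normalizing pp, mf and ff recordings of the same
# note, the ff frames should exhibit a markedly higher centroid,
# which is one reason the dynamics remain identifiable by ear.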
2. THREE HISTORICAL EXAMPLES

2.1 K. Stockhausen: Kontakte

Kontakte [6] was originally a 4-channel electronic piece composed in 1958-60 by Karlheinz Stockhausen. Soon after, the composer wrote a version for piano, percussion and the same 4-channel electronic material. The original score shows one of the first composer-written attempts to graphically notate the electronic material using unconventional graphical signs. The second edition, published in 2008, adds some hints at the balance between the amplified instruments and the electronics. In Figure 2, a + above the piano means that the level of the amplification of that instrument should be raised until N (normal) is reached.

[…] diffusion, added only in the second edition, is limited to +, - and N (normal) signs. However, this is already sufficient to give an idea of the sound diffusion.

[…] stance, in Nono’s works, and, hence, do not need to be notated in the same detailed way.

3. HYPOTHESES
3.1.2 Spatial taxonomy: space families

During the composition, Stroppa organised space into a personal taxonomy made of three space families: points (P), surfaces (S) and diffused space (D). He then related the six reverbs and the amplified instruments to it. Points correspond to the direct amplification of an instrument, assigned to only one or two loudspeakers depending on the setup (Figure 9).

Figure 9. Points: double amplified quartet (6 loudspeakers).

Surfaces use only the early reflections and cluster stages of reverberation. At each point two adjacent loudspeakers are added, providing a certain spread (called “width”) to the sound image. The control of the width size is automated during the performance (Figure 10).

Figure 10. Surface: width spread for the viola and cello.

Finally, the diffused spaces use only the late reverberance, and produce a sound that seems to come from everywhere or… nowhere!

In the performance score, each instrument is considered as one of the voices of the electronics, and is “spatially orchestrated” by the CMP, that is, sent to one or another spatial family depending on what is being played. The final result is an augmented sound image that is not only much larger and deeper than usual, but also varies dynamically during the performance. The spatial projection hence highlights the frequently used “spiral-like” materials, characterized by musical figures that present similar musical elements across the instruments at slightly different times.

3.1.3 Notational choices

Given these preliminary factors, and after 25 years of performance experience, a definitive musical score for the electronics was established and written immediately below the instrumental parts.

We decided to notate the composed spatial taxonomy directly, by associating a symbol (P, S or D) and a colour (blue, green or red) with each family. The other parameters (spatial width and reverb time) are automatized in Antescofo, but their changes are mentioned above the instrumental score, near the event name (see Figure 11, e.254.1-2), since this proved to be a useful reminder for the CMP.

3.1.4 Reference Level

Our hypothesis for notating the dynamic levels is based on the crucial notion of “Reference Level” (RefLev). The RefLev is a perceptual, empirically established value. It depends not only on the audio setup and the characteristics of the hall, but also on the aesthetic preferences of the CMP. We define the RefLev as the level at which the points (the directly amplified instruments) sound “naturally amplified” in the hall and balanced with each other.

Once the RefLev for the points is specified, the RefLev for the other spaces is defined as the level at which they sound “naturally balanced” with the points.

When all the RefLev’s are set up, the same physical position of the faders should sound equally loud for all the spaces, in spite of the differences (size of the instrument, position and type of microphones, nature of the spaces, and so on)5. This is, of course, a very personal estimation, as it is not easy to compare, for instance, the sound of an amplified violin coming from one loudspeaker with the reverberated sound of a cello coming from all the loudspeakers.

5 This position, as well as the dynamic curves, can be defined by the user in the patch, but, usually, it is located at about ¾ of a fader’s length.

At the beginning of the rehearsals, the RefLev’s must be empirically and precisely set up. In the score, they are notated with the letter “N” (normal). Notice that the same RefLev may produce a very loud sound, if the musicians are playing fff, or a very soft sound, if they are playing ppp.
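To make the RefLev mechanism concrete, the sketch below (ours, not the authors' Max patch; family names follow Section 3.1.2, all gain values are illustrative) shows per-family trims calibrated so that the neutral fader position sounds equally loud across spaces:

# Hypothetical RefLev trims (linear gains), set by ear in rehearsal so
# that the neutral fader position sounds "naturally amplified" for the
# points and "naturally balanced" for the other spaces.
REF_LEV = {"P": 0.71, "S": 0.89, "D": 1.12}  # points, surfaces, diffused

NEUTRAL_FADER = 0.75  # about 3/4 of the fader's length (cf. footnote 5)

def output_gain(family, fader_position):
    # Linear gain sent to the audio engine for one space family:
    # at the neutral position every family reproduces exactly its RefLev.
    return REF_LEV[family] * (fader_position / NEUTRAL_FADER)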
3.1.5 Level changes

Once the RefLev’s are defined, all the other levels are notated as a dynamic difference with respect to them, marked with one to three “+” or “-” signs (for instance, “+++” or “- -”). They are defined as three clearly different and perceptible dynamic layers: one +/- means slightly louder/softer than the RefLev, two +/- means clearly louder or softer, and three +/- are extreme levels, from macro-amplified to barely amplified.

These levels are not absolute, but rather correspond to perceptual areas, and will therefore vary during the piece as a function of what kind of music is being performed. They indicate subjectively different “steps” in the amplification process: seven dynamic steps were considered necessary and sufficient to accurately perform the sound diffusion of Spirali.

Since the changes between levels are not very complex, the traditional signs of cresc. and dim. were adopted, because they are expressive, use a space in the score that does not depend on the dynamic range, and allow for the notation of a duration (see Figure 11).

3.1.6 Final score

Once the preliminary choices are clear, the notation of the electronics, placed below the instrumental score, is quite straightforward (Figure 11).

Figure 11. Spirali: manuscript score, p. 58 (© Casa Ricordi, by kind permission).

The usage of colours to identify the different spatial families turned out to be a very important ergonomic feature, improving the readability of the score. The relation between the notation and the physical gestures needed to operate the control faders becomes more straightforward and faster to learn.

In addition, the isolation of single elements in the instrumental score, using the same colour as the space they belong to, helps to focus on the correct timing and action to perform, especially if the passage is short and/or difficult to perform.

Finally, even if printing a score in colours is still not very widespread, because of the production costs, generating a coloured PDF file and performing Spirali reading the score on a computer or a tablet already seems very reasonable.

Notice that the acoustic string quartet should not be aware of what is going on in the space, as the spatial changes risk negatively influencing the quality and accuracy of the interpretation. It just has to play!
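The seven-step notation of Section 3.1.5 can be summarised as a mapping from written symbols to perceptual offsets around the RefLev. A minimal sketch of our own reading of it (the dB offsets are purely illustrative; the paper deliberately leaves the steps perceptual rather than absolute):

# Illustrative mapping of the seven written dynamic steps of Spirali
# onto dB offsets around the RefLev ("N"). The numbers are NOT from
# the paper; only the seven-step structure is.
STEP_DB = {
    "---": -18.0,  # barely amplified
    "--":   -8.0,  # clearly softer
    "-":    -4.0,  # slightly softer
    "N":     0.0,  # RefLev: "naturally amplified"
    "+":    +4.0,  # slightly louder
    "++":   +8.0,  # clearly louder
    "+++": +18.0,  # macro-amplified
}

def cresc(start, end, t):
    # A notated cresc./dim. read as an interpolation between two steps;
    # t runs from 0.0 to 1.0 over the notated duration.
    return STEP_DB[start] + t * (STEP_DB[end] - STEP_DB[start])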
3.2 Levels of sound synthesis: the case of Traiettoria

3.2.1 Setup

Traiettoria [11] is a 45’-long cycle of three pieces for piano and computer-synthesized sounds, written by M. Stroppa in the early 80s.

The electronics is solely made of eight stereo sound files (from ca. 3’ to 7’ long), which exclusively use additive synthesis and frequency modulation, with no reference to the piano’s spectral structure. A strong connection with the instrument is established by “tuning” the electronic material to some harmonic structures played by the piano. The integration between the synthetic and the acoustic materials is very deeply structured, and can produce a compelling fusion, if the electronics is correctly performed!

The piano and the electronics are loosely synchronised by means of temporal pivots [12].

3.2.2 Spatial families

The sound diffusion of Traiettoria is composed of two main spaces:

a. a reduced space, made of the amplified piano (2 loudspeakers placed near the instrument) and of one loudspeaker facing the piano’s sound board and placed under the instrument, from which a mono version of the electronics is diffused, so as to sympathetically interfere with the resonating strings;

b. an enlarged space, around the audience, uniquely reserved for the electronic sounds.

The constitution of the enlarged space was not specified in the original score, and could span from two loudspeakers behind the audience to a whole Acousmonium6. Ideally, the more loudspeakers are available, the more dimensions the enlarged space may have, and, therefore, the more subtle and expressive the spatial nuances can be. But the difficulty of the electronic performance is significantly increased!

6 See http://fr.wikipedia.org/wiki/Acousmonium (accessed 1/28/2015).

After several decades of experience, and thanks to the work of Carlo Laurenzi at Ircam, the electronics was implemented in Max. As in Spirali, a spatial taxonomy was defined, but, this time, only as a result of the performances with several different audio systems and configurations, and not when the piece was composed. A suggested, standard taxonomy for the sound diffusion was then defined: 7 families of spaces (totalling 11 main loudspeakers, see Figure 12). Each family is given a name and a symbol and is controlled by one fader: FC (Front Centre), Pf (Piano), U (Under the piano), F[L/R] (Front [Left/Right]), M[L/R] (Middle [Left/Right]), R[L/R] (Rear [Left/Right]), RC (Rear Centre). It is for this taxonomy that a new notation was established.

3.2.3 Notational choices

When Traiettoria…deviata was first published, it was provided with a unique, exhaustive notation of the synthetic sounds [13], a simple notation of the two main diffusion spaces (M = under the piano, D/S = left/right) and a double time staff (Tpo, Figure 13). The absolute times placed in the middle of the time staves are temporal pivots; the other markings belong to either the piano or the electronics7.

[…] spectral centre of gravity toward a higher region together with the movement of the faders. This was done with a HP-filter placed on the electronics’ stereo input and moved together with the fader.

As impressive as it may look, this notation proved not to be very practical for the sound diffusion. It contained too much information that was not required during a concert, and too little information regarding the actual spreading of sound.

Finally, its “orchestral” appearance made it difficult for the pianist to grasp which sounds are easier to hear, and therefore to visually identify the essential cues corresponding to the temporal pivots to which the performance had to be synchronised. A more pragmatic and expressively efficient solution had to be found.
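The fragment above mentions a high-pass filter on the electronics' stereo input that moves together with the fader, so that raising the level also shifts the spectral centre of gravity upward, as in an instrumental crescendo (cf. Section 1.1). The sketch below is our speculative reconstruction only: the actual patch, filter order, cutoff range and fader mapping are not documented here.

import numpy as np
from scipy.signal import butter, lfilter

SAMPLERATE = 48000  # assumed sample rate

def fader_to_cutoff(fader, f_min=30.0, f_max=400.0):
    # Map fader position (0..1) to a high-pass cutoff: the higher the
    # fader, the higher the cutoff, so the centroid rises with the level.
    # The 30-400 Hz range is purely illustrative.
    return f_min * (f_max / f_min) ** fader

def process_block(block, fader):
    # Apply the fader gain and the fader-coupled high-pass to one block.
    # Stateless per-block filtering; a real-time version would carry the
    # filter state across blocks to avoid clicks.
    b, a = butter(2, fader_to_cutoff(fader) / (SAMPLERATE / 2), btype="highpass")
    return fader * lfilter(b, a, block, axis=0)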
3.2.4 Reference Level

Based on our experience with Spirali, we defined a RefLev for Traiettoria as the subjective level at which the piano sounds “naturally amplified”, and the electronics “naturally balanced” with it. However, here it did not seem necessary to explicitly mark it in the score (with N). Three degrees of +/- indicate, as in Spirali, six perceptually different dynamics for the piano or the electronics.

During the performance of Traiettoria, the most difficult task is to find a musical balance between the sound in the hall and the piano (and some electronics) on stage. How can one compare, for instance, an electronic sound coming from behind the audience with the piano? When the same level is indicated in the score, it is the task of the CMP to (subjectively) estimate the correct sound image and intensity.

3.2.5 Composition of the sound diffusion

Even though, in theory, there are as many ways to perform the sound diffusion of Traiettoria as there are concerts, practical experience showed that some strategies were more musical and tended to be regularly repeated.

In the tradition of acousmatic music, the sound diffusion is conceived as a real orchestration of the electronic voices over a moving, imaginary space. Stroppa composed a precise hierarchy that organises not only the audio setup, but also the spatial form of Traiettoria.

For instance, Traiettoria…deviata starts with a barely amplified piano that gets increasingly louder, that is, more amplified. This yields a larger and larger sound image. When the electronics joins in, it fades into the piano’s decaying resonance, and comes out only from U (see 3.2.2). Little by little, the constricted space of the electronics opens up to the Pf and the F groups, thus unfolding its image around the piano. It is only at 1’57 that the R group is activated. A detailed analysis of the spatial form of the sound diffusion of Traiettoria is beyond the scope of this text, but it is important to remark that, since it is an important part of the composition of the piece, it needs to be precisely and correctly notated.

Each spatial group is represented by one fader on the control interface8 and by one vertical position in the score. Since each group is identified by a letter, it needs to appear in the score only when it is active. In this way, the usage of the space within the page is more efficient.

8 A MIDI mixer or an OSC-driven device, such as an iPad.

3.2.6 Level changes

It did not seem necessary to find a more refined way to notate level changes than what was used in Spirali. In the few moments where a random spread is needed, it is directly asked for by some text written in the score, and each CMP can freely choose how to perform it.

3.2.7 Main/secondary loudspeaker(s)

Together with the taxonomy explained in 3.2.2, the sound diffusion of Traiettoria extends the concept of the loudspeaker. Each spatial family, identified by a letter, represents the “main loudspeaker”, defined as the loudspeaker (or the pair of loudspeakers) that is heard as the main source of diffusion.

It is, however, always possible, depending on the characteristics of the hall or personal taste, to enlarge the focus of a single loudspeaker by diffusing the same electronic material into nearby loudspeakers (called “secondary loudspeakers”) at a softer level, so as to change the acoustic image of the main loudspeaker without the other ones being directly perceived.

Being rather a performer’s aesthetic choice, we decided not to notate this sound-diffusion technique, except when it had a compositional role.

3.2.8 Score

The final score is still under preparation, but concrete experiments and current sketches showed that simply notating the levels above the piano part was not sufficient to achieve a good performance and to learn efficiently from the score.

After some tests, we found that adding a sonogram window of a mono mix of the synthetic sounds at the top of the page was the best choice to correctly perform the electronics.

Even if a sonogram is very concise and cannot precisely represent pitched and rhythmic material, the most important temporal elements are still clearly identifiable and help both performers to follow the spectro-morphological unfolding of the electronics. And if some special pitch or rhythmic structures need to be marked, it is always possible to locally add this information on the sonogram or between it and the dynamic levels.

Thanks to the very explicit images of the sonogram of the synthetic sounds, learning the correct synchronization is no longer difficult (Figure 14).

When dealing with several sound files that are inherently unbalanced9, the sound diffusion can become a tedious and cumbersome task, as each new sound would require a different position of the fader to compensate for the inherent lack of balance.

9 For instance, because they are synthesized with radically different techniques and have extremely dissimilar spectral contents.

To avoid this problem, a special solution, called “relative faders” (RelFad), was implemented in all the patches for Stroppa’s electronic works. Before being multiplied by the value corresponding to the position on the control interface, each RelFad is first multiplied by a value written in the Antescofo score. In this case, if the written values are just right, it is enough to keep the fader at its neutral value (1.0). However, if unpredictable circumstances modify the perception of the diffused sounds, the RelFad can still be moved away from its neutral value.
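The RelFad mechanism reduces to one multiplication per voice. A minimal sketch (ours; the actual Max/Antescofo implementation is not reproduced here, and the per-file values are hypothetical):

def relfad_gain(score_value, fader_position=1.0):
    # The gain applied to a sound file is the product of a balancing
    # value written in the Antescofo score and the physical fader,
    # whose neutral position is 1.0. If the written value is right,
    # the fader can simply stay at 1.0 and is only moved to correct
    # for unpredictable circumstances.
    return score_value * fader_position

# Hypothetical balancing values written in the score:
score_values = {"sound_file_1": 0.8, "sound_file_2": 1.3}
gain = relfad_gain(score_values["sound_file_2"])  # fader untouched: 1.3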
As a consequence, the movement of the faders during the performance is greatly reduced, and the performance itself becomes more ergonomic and gesture-effective. The written values lie halfway between the realm of composition and that of interpretation, and can always be very easily changed. One might also imagine having presets of good values for different acoustical situations.

Since they were implemented, RelFad’s have greatly improved the task of learning to perform the electronics of a mixed piece, and have helped to spread the sound diffusion technique to a larger community of CMP’s.

Figure 14. Traiettoria: sketch of the new electronic score. Relative faders.

4. CURRENT STATE

The notation of dynamic levels in the performance scores of Spirali and Traiettoria was inspired by the late Nono’s works, but the musical context is very different and has a totally different goal.

In Nono’s works the notation was intended to approximately indicate the behaviour of the levels, in order to provide a schematic structure for the performance of pieces which allowed for a certain degree of improvisation from both the instrumental and the electronic parts.

Stroppa, on the other hand, intends to confer a much higher responsibility on the role of the CMP, who is required to possess a performance skill comparable to that of an instrumentalist. For this reason, the performance score must contain all the information needed to interpret the piece and accurately represent the time relationships between the acoustic instrument(s) and the electronics.

It is obvious that such a detailed performance score needs some time to be learnt and practiced.

Finally, this score may also have the crucial function not only of effectively transmitting precise information about the sound diffusion to other CMP’s, but especially of making it possible to understand how to render a complex orchestration of synchronized spatial events between electronics and instruments.

Due to the complexity of the music and the amount of actions involved in the sound diffusion, learning the score by heart rapidly became a necessity. However, the performance score was still extremely useful during the learning phase and the rehearsals.

5. CONCLUSIONS

Our experience has shown that it is possible to find generalized and efficient symbols to notate the sound diffusion of electronic works, if it is not automated.

Our first step was to identify a spatial taxonomy adapted to a given piece, in order to find an intermediate layer of notation between the compositional concepts, the performance needs and the physical audio setup.

The next step was to define the meaning and the value of a RefLev for each situation and to notate all the other relative dynamic changes with respect to this subjective value. Introducing RelFad’s also greatly improved the gestural aspects of a performance.

Our next step will be to extend this experience to the control of real-time treatments.

Acknowledgments

This research was partially made during Stroppa’s sabbatical leave from the Hochschule für Musik und Darstellende Kunst in Stuttgart.

6. REFERENCES

[1] L. Austin, “Sound Diffusion in Composition and Performance: An Interview with Denis Smalley,” Computer Music Journal, vol. 24, no. 2, pp. 10-21, 2000.

[2] H. Eimert, F. Enkel, and K. Stockhausen, “Fragen der Notation Elektronischer Musik,” Technische Hausmitteilungen des Nordwestdeutschen Rundfunks, vol. 6, pp. 52-54, 1954.

[3] G. Haus, “EMPS: A System for Graphic Transcription of Electronic Music Scores,” Computer Music Journal, vol. 7, no. 3, pp. 31-36, 1983.

[4] N. Bernardini and A. Vidolin, “Sustainable Live Electro-acoustic Music,” in Proceedings of the International Sound and Music Computing Conference, 2005. http://server.smcnetwork.org/files/proceedings/2005/Bernardini-Vidolin-SMC05-0.8-FINAL.pdf (accessed Jan. 28, 2015)
AUTOMATED REPRESENTATIONS OF TEMPORAL ASPECTS OF ELECTROACOUSTIC MUSIC: RECENT EXPERIMENTS USING PERCEPTUAL MODELS

Dr David Hirst
School of Contemporary Music
Faculty of VCA and MCM, University of Melbourne
[email protected]
[…] on the spectrogram, and it was possible to get a time-stamped listing of the annotation layer using the program Sonic Visualiser [2], which was then imported into a spreadsheet program (Microsoft Excel) and printed as a listing of all the annotations. The visual screens and printed time-stamped sound object listings became the data that facilitated detailed identification and specification of sound events within the aurally identified sections of the work.

The next phase of the analysis involved moving beyond the acoustic surface to examine structures, functions and motions between sound events. By “zooming out” to look at longer sections of the work, or carrying out “time-span reduction”, we can observe changing sonic patterns over the course of the work. We can look at the different sections and ask questions like: what propels the work along from moment to moment, section to section, or scene to scene? To help answer this question, we can observe that an increase in sonic activity seems to elicit expectation in the listener that an important event is about to occur. But how can we measure and, even better, display activity within a work? Well, the Sonic Visualiser program provides access to a suite of signal analysis plugins. In the Unsound Objects article, I postulated that the type of analysis that seems to correlate best with sound object activity is a plot of “spectral irregularity” versus time.

There are several different methods for calculating the irregularity present within a spectrum, but essentially they all give a measure of the degree of variation between the successive peaks of the spectrum. Jensen, for example, calculates the sum of the squares of the differences in amplitude between adjoining partials [3]. What I am postulating here is that where there is a large variation across the spectrum, partial to partial, this can provide us with a depiction of a high degree of activity. Figure 1 depicts a spectral irregularity plot for the whole of Unsound Objects.

Figure 2. Plot of Inter-onset Time vs Time (secs) for the whole of Unsound Objects.

Figure 3. Plot of Inter-onset Rate vs Time (secs) for the whole of Unsound Objects.
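Jensen's irregularity measure, as summarised above, fits in a few lines. A minimal numpy sketch (ours; it assumes the partial amplitudes have already been extracted, and normalisation details vary between published formulations):

import numpy as np

def jensen_irregularity(partial_amps):
    # Sum of squared differences in amplitude between adjoining
    # partials (Jensen [3]); a jagged spectrum gives a large value,
    # which is read here as a high degree of "activity".
    a = np.asarray(partial_amps, dtype=float)
    # Normalising by the total squared amplitude makes the measure
    # independent of the overall level.
    return float(np.sum(np.diff(a) ** 2) / np.sum(a ** 2))

print(jensen_irregularity([1.0, 0.9, 1.1, 0.8]))  # mildly irregular spectrum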
The analysis of Unsound Objects then combined the use of spectral irregularity plots with aurally identified sections within the work, to provide a detailed analysis of “activity” and to tabulate “sound types” for each section. This table showed “activity amount and type” and “selected sound object types”. The work actually divides into two main halves, and after the two halves were compared, a summary of sonic archetypes (in the form of mimetic archetypes and structural archetypes), sound transformations, functional relations, and sonic activity was discussed.

Determining Activity

The aim of the next study [4] was to seek an alternative method to the use of “spectral irregularity” for measuring activity in electroacoustic music.

In essence, activity could be defined as the number of sound events in a given time period. Therefore we are interested in the onset time of each sound event, and its duration. Let’s start with onset time. What signal analysis tools exist for determining sound event onset time within a musical work?

The program Sonic Visualiser has a number of tools within it to perform such an analysis. Aubio onset detection (aubio.org) has eight different types, which all produce a single list of time “instants” (vertical lines when plotted) of individual start times. This output can be exported to a spreadsheet. The algorithm can be varied to suit the source material. The Queen Mary, University of London, in-built Sonic Visualiser onset detection algorithm lists three types of onset detector, but these are just the one detector with lots of variables: Program; Onset detection function type; Onset detection sensitivity; Adaptive whitening; Channel options for stereo files; Window size; Window increment; and Window shape. Its output is an “onset detection function”, which is a probability function of a “note” onset likelihood.

In developing a method for the detection of onsets in Unsound Objects, combining several forms of representation was found to provide a more reliable guide to data gathering than using any single plot. After some experimentation, the following combination was employed, using the Queen Mary algorithms:

1. RMS Amplitude.
2. Smoothed detection function: Time Values (displays probability function of onsets).
3. Note onsets: Time Instants. Program: Soft Onsets; Onset detection function: Complex Domain; Onset detection sensitivity: 60%; Adaptive whitening: Yes.

This resulted in the onsets (#3 above) aligning pretty well with the smoothed detection probability (#2 above), but with some low-level noise swells failing to trigger the onset detector (#3 above).

The “time instants” data (#3 above) was exported, then imported into an Excel spreadsheet in order to be able to make further calculations such as “inter-onset times” (the time between onsets). Figure 2 shows a plot of Inter-onset Time versus Time for the whole of Unsound Objects. Its peaks show us where there are long breaks in the work, and give a pointer to how the work may be divided up in analysis.

Displaying time instants, however, only takes us part of the way to obtaining a measure of event “activity”. Inter-onset “rate” was then calculated and plotted, as shown in Figure 3. This provides us with a measure of the number of onsets per second, which, in turn, provides a guide to the amount of event-initiation activity at a particular time within the work.

Implications of Activity Plots

Determining inter-onset time can give us a plot (Figure 2) that is useful in showing the main sections within a work. Calculating its reciprocal, the inter-onset rate, can generate a graph that provides some measure of the varying activity within an electroacoustic work (Figure 3). If we had graphed Figure 3 at the beginning of the analysis, we would have observed that the piece does divide into two, with little activity between about 390 and 410 seconds. The first half begins with three bursts of activity, followed by a longer, more active phase of increasing activity until the “mid-break”. The second half is more continuously active until around 660 seconds, after which the work has several less active periods, perhaps in preparation for the end of the piece.

In the previous analysis of Unsound Objects, sections were first determined aurally, then superimposed over the irregularity plot. Comparing the plot of inter-onset rate (Figure 3) with the irregularity plot (Figure 1), we can see that the piece appears to be much more active in Figure 3 than in Figure 1, especially in the second half. The question remains as to which is the better measure of “activity”. The inter-onset rate is probably a more accurate method, but it seems exaggerated. This is possibly because it doesn’t take into account the loudness of the events. Perhaps if this plot (Figure 3) were modified by the RMS amplitude, a more useful picture of “effective activity” might emerge. There are also inherent definition problems for “iterative” sound events, such as drum rolls or machine sounds. Is such a sound type one long event or many short events? This phenomenon may skew the events-per-second data.

In terms of automating analysis, the inter-onset time plot (Figure 2) is very effective in identifying sections in a long musical piece, while the inter-onset rate (Figure 3) does provide a measure of active versus inactive passages in a long piece.
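The spreadsheet step is straightforward to reproduce in code. A minimal sketch (ours; it assumes the onset “time instants” have been exported from Sonic Visualiser to a one-column CSV file, here called onsets.csv, with strictly increasing times):

import numpy as np

onsets = np.loadtxt("onsets.csv", delimiter=",", usecols=0)  # seconds

inter_onset_times = np.diff(onsets)          # time between onsets (s)
inter_onset_rates = 1.0 / inter_onset_times  # onsets per second

# Peaks of inter_onset_times mark long breaks (candidate section
# boundaries, cf. Figure 2); inter_onset_rates, plotted against
# onsets[1:], gives the event-initiation activity of Figure 3.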
The next step in this work was to examine activity and other temporal measures in other works, including more rhythmical pieces.

RHYTHMOGRAM REPRESENTATIONS

This section of our paper introduces work that is well documented in a paper from the ICMC in 2014 [5], but it will be very briefly summarized here to place our subsequent work on automated segmentation into the context of our ongoing work, and to demonstrate some contrasting and varied representations.

Having investigated activity plots, the aim of the next stage of our work was to continue our Segregation, Integration, Assimilation, and Meaning (SIAM) approach of employing a cognitive model [6], in combination with signal processing techniques, to analyse the “raw” audio signal and, more specifically, to depict time-related phenomena (beat, rhythm, accent, meter, phrase, section, motion, stasis, activity, tension, release, etc.). Such depictions should assist or enhance aural analysis of what is, essentially, an aural art-form.

After an extensive literature search, the use of the “rhythmogram” in the analysis of speech rhythm, and in the analysis of some tonal music, seemed to fulfill the requirement of a cognition-based method that uses an audio recording as its input signal to produce a plot of the strength of events at certain time points.

The Rhythmogram

In my ICMC 2014 paper [5], I provided a thorough explanation of the rhythmogram, so I will only briefly summarise it here. The framework is documented in Todd [7], Todd & Brown [8] and Marr [9]. It makes use of quite a traditional auditory model, where outer and middle ear responses are modelled by filtering, then gammatone filters model the basilar membrane. This is followed by the Meddis [10] inner hair cell model, which outputs the auditory nerve firing probability. This output is then summed and processed by a multi-scale Gaussian low-pass filter system. Peaks are detected, summed and plotted on a time-constant versus time graph, resulting in a plot known as a rhythmogram.1

1 A version of Silcock’s schematic [11] for Todd and Brown’s model is shown in the Hirst (2014) ICMC paper [5].

Figure 4 shows an example rhythmogram for a repeating pattern of three short 50 ms tones, followed by a 550 ms period of silence, lasting 7 seconds.

Figure 4. Rhythmogram for a repeating pattern of three short 50 ms tones, followed by a 550 ms period of silence.

Notable features of the rhythmogram model are:

- Consideration of sensory memory consisting of a short echoic store lasting up to about 200 to 300 ms and a long echoic store lasting for several seconds or more2.
- Each filter channel detects peaks in the response of the short-term memory units.
- The sum of the peaks is accumulated in a simplified model of the long echoic store.
- An “event” activation is associated with the number of memory units that have triggered the peak detector and the height of the memory unit responses.
- The hierarchical tree diagrams of Lerdahl and Jackendoff [12] have visual similarities to rhythmogram plots, so rhythmograms may help the researcher gain insights into the hierarchical structure of a musical work under investigation.
- Not only does the rhythmogram model detect the onsets of events, but it can also represent other rhythmic grouping structures based on inter-onset times, changes in rhythm, and meter.
- Changing the analysis parameters allows the researcher to “zoom in” or “zoom out”: to focus on short-term rhythmic details, or to provide a representation of an entire section, or even a complete work.

2 Todd (1994), pp. 34-35.

In the case of the final point above, both of these levels of focus have been explored, and a summarised illustration of both short-term and long-term structures will be recapitulated briefly here, after the sketch below.
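To make the multi-scale idea concrete, the following heavily simplified sketch (ours, not the Todd/Brown MATLAB code) uses the rectified signal in place of the auditory front end, a shortcut also taken in the study described next, and log-spaced time constants covering the 10 to 500 ms range used in the detailed test below:

import numpy as np
from scipy.ndimage import gaussian_filter1d
from scipy.signal import find_peaks

def rhythmogram(signal, samplerate, n_scales=100, min_tc=0.010, max_tc=0.500):
    # Rectification stands in for the gammatone/hair-cell front end.
    envelope = np.abs(signal)
    time_constants = np.geomspace(min_tc, max_tc, n_scales)
    grid = np.zeros((n_scales, len(signal)))
    for i, tc in enumerate(time_constants):
        # Multi-scale Gaussian low-pass: one smoothed response per
        # time constant, with sigma expressed in samples.
        smoothed = gaussian_filter1d(envelope, sigma=tc * samplerate)
        # Peaks of each response, marked with their height, give the
        # time-constant versus time plot of the rhythmogram.
        peaks, props = find_peaks(smoothed, height=0.0)
        grid[i, peaks] = props["peak_heights"]
    return time_constants, grid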
project [13]. The code makes use of the fact that it is possible to increase the efficiency of the computation and still obtain a useful, meaningful rhythmogram plot by using a rectified version of the input signal directly, i.e. bypassing the Gammatone filterbank and inner hair cell stages.3

The electroacoustic works chosen for analysis in this study are collectively known as Robert Normandeau's Onomatopoeias Cycle, a cycle of four electroacoustic works dedicated to the voice. The Onomatopoeias Cycle consists of four works composed between 1991 and 2009, which share a similar structure of five sections and are of a similar duration of around 15 minutes. The works have been documented by Alexa Woloshyn [14], and by Normandeau himself, in an interview with David Ogborn [15].

Two types of analysis were performed. The first is a detailed rhythmic analysis of a short segment of one of the works. The second analysis zooms out to examine the formal structure of three pieces in the cycle and make comparisons.

Detailed Analysis of a Short Segment of Spleen

The work chosen for detailed rhythmic analysis was the second work in the cycle, called Spleen [16]. This work4 was chosen as it has a very distinctive beat in various sections, which is slightly unusual for an electroacoustic work. Figure 5 shows a rhythmogram for the 13.5 second segment of musique et rythme from Normandeau's Spleen. The X-axis is time (in secs) and the Y-axis is filter number (from 1 to 100). For the full test parameters see [5]; for now we note that the minimum time constant was 10 msec, and the maximum time constant was 500 msec for this test.

Labelled as 'A' in Figure 5, the tallest spikes correspond with a "low thump", somewhat like a bass drum. Using these spikes we could even infer a tempo from their regularity. Labelled as 'B' ("soft low thumps") in Figure 5, softer peaks are interspersed between the louder peaks and are equidistant.

To summarise our observations further, we can note that there is a rhythmic background of regular beats, consisting of low thumps, arranged in a hierarchy with softer low thumps interspersed. The "tempo" is around 66 bpm, and an implied duple meter results from the alternation of loud and soft thump beats.

Against this regular background is a foreground of vocal "yow" shouts. Less regular in their placement, the shouts become elongated to "yeow", and then amplitude modulated to add colour and variety. Although less regular in their placement, the "shouts" always terminate on a "thump" beat and thereby reinforce the regular pulse.

There are finer embellishments too, labelled 'C' in Figure 5. This third level of spikes in the rhythmogram depicts events that are placed between thump beats and have a timbre that is somewhere between a saw and a squeaky gate. We describe these events as "aw" sounds; they function as an upbeat to the main thump beat. This "one and two and three and four" pattern has a motoric effect on the passage. The presence of further, shorter, regular spikes indicates more sound events which function to embellish the basic pattern.

Looking at the rhythmogram as a whole for this passage, we can observe that it tells us there are regular time points in the sound, that there is a hierarchy of emphasis in the time points (implying some meter), and a further hierarchy in the sense that there is a background of a regular part (the thumps) and a foreground of less regular vocal shouts. Both the background and the foreground have their own embellishments: anticipation of the events in the case of the former, and an increase in length and use of amplitude modulation in the case of the latter.
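To make the simplified computation described above concrete (rectify the input, smooth it with a bank of filters indexed by time constant, then detect peaks at each time scale), the following MATLAB sketch shows one way such a plot can be produced. It is a minimal illustrative reconstruction, not the Brown/Aubanel LISTA code: the input signal x, its sample rate fs, the one-pole smoothing filters, and the logarithmic spread of 100 time constants from 10 msec to 500 msec (the Spleen test parameters) are all assumptions.

    % Minimal rhythmogram sketch (illustrative; not the LISTA code).
    % Assumes x is a mono input signal and fs is its sample rate in Hz.
    nFilters = 100;                                        % as on the Y-axis of Figure 5
    tau = logspace(log10(0.010), log10(0.500), nFilters);  % time constants, 10 msec to 500 msec
    env = abs(x(:)');                                      % rectified input (auditory front-end bypassed)
    R = zeros(nFilters, numel(env));
    for k = 1:nFilters
        a = exp(-1 / (tau(k) * fs));                       % one-pole low-pass with time constant tau(k)
        R(k, :) = filter(1 - a, [1, -a], env);             % smoothed envelope at this time scale
    end
    P = false(size(R));                                    % peak positions at each time scale
    for k = 1:nFilters
        [~, locs] = findpeaks(R(k, :));
        P(k, locs) = true;
    end
    [row, col] = find(P);                                  % plot peaks: time vs filter number
    plot(col / fs, row, 'k.');
    xlabel('Time (secs)'); ylabel('Filter number');

As with the analyses above, the choice of minimum and maximum time constants determines whether short-term rhythmic detail or longer-term form dominates the resulting picture.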
voices. The final piece in the cycle is Palimpseste, from 2005, and it is dedicated to old age. The first three works were analysed, and rhythmograms were created for them.

As these works are each about 15 minutes long, a different set of analysis parameters was required from the analysis of just a 13.5 second excerpt. After a lot of experimentation, a suitable set of parameters was found. The reader can see [5] for further details, but significantly, the minimum time constant was 0.6 seconds, and the maximum time constant was 30 seconds. These parameters represent a "zoomed out" temporal view of the three pieces.

Figure 6 depicts the rhythmogram (Time vs Filter No.) for Éclats de voix for its full duration of around 15 minutes. The alternating grey and white areas mark out the five sections that each piece is divided into, as tabulated by Woloshyn in her paper [14].

There is not the space within the confines of this paper to show the rhythmograms for all three Normandeau works in the cycle. Neither is there the space to go into our detailed findings; however, we can make some indicative comparisons in summary here.

Comparing Spleen with Le renard, we observed similarities between the rhythmic profiles of sections 1, 3, 4 and 5. Comparing the rhythmograms from Éclats de voix and Spleen, there are some similarities of shape, especially in sections 3, 4 and 5. Éclats is busier than Spleen, which is busier than Le renard et la rose. Finally, the contrasts become more exaggerated with each piece.

Remarks About Rhythmograms

This initial use of the rhythmogram in the analysis of electroacoustic music has demonstrated that the algorithm is capable of displaying the temporal organization of a short segment with details that may enhance analysis through listening. The algorithm is also flexible, given the careful selection of analysis parameters, in the sense that it can also be used on entire pieces to help elicit information regarding more formal temporal organisational aspects, and to make comparisons with other works.

Some of its shortcomings are that it cannot solve the separation problems of polyphonic music, rhythmograms can be awkward to interpret, and they also rely on aural analysis. Careful selection of analysis parameters is crucial in obtaining meaningful plots.

Figure 6. Rhythmogram of the whole of Éclats de voix from Normandeau's Onomatopoeias cycle.
Automated Segmentation Model

For large-scale segmentation, a method for media segmentation proposed by Foote and Cooper [18] was used as a model. Their method focuses on the notion of self-similarity. Essentially, the spectrum of every time-segment of an audio work is compared with every other time-segment spectrum, and a "similarity matrix" is created for the whole work. Foote and Cooper [18] describe how the work can be divided into sections from the similarity matrix through the construction of a "novelty curve": 'To detect segment boundaries in the audio, a Gaussian-tapered "checkerboard" kernel is correlated along the main diagonal of the similarity matrix. Peaks in the correlation indicate locally novel audio, thus we refer to the correlation as a novelty score'.

Large peaks detected in the resulting time-indexed correlation are then labeled as segment boundaries.

Foote and Cooper go on to describe how they calculate similarity-based clustering to derive the signature of a musical piece, but our work has only proceeded as far as testing the segmentation technique within the electroacoustic musical realm.

Automated Segmentation in Practice: Method I

Figures 7 and 8 demonstrate an example of a "novelty curve" and its accompanying segmented audio for the first 3 minutes of Harrison's Unsound Objects [19]. Figure 9 shows the sections derived by a human listener superimposed over the spectral irregularity plot for the same extract of Unsound Objects; it is included for the sake of comparison between automated methods and a human analyst.

Using this segmentation method, the "kernel size" was manipulated to produce section lengths approximating the manual analysis. With a kernel size of 1250 samples, 7 segments were created in the first 3 minutes.

Comparing Figures 8 and 9, we can observe that automated segments 1 and 2 (Figure 8) match Section 1 of the manual analysis pretty well (Figure 9). Similarly, automated segments 3 and 4 seem to match Section 4, automated 5 and 6 line up with Section 3, and automated segment 7 matches the manual Section 4. At first glance, then, this seems quite a useful method of segmentation. However, in deriving this representation, a convolution computation time of nearly 16 minutes is required for a "kernel size" of 1250 samples in the similarity matrix (quite a large kernel size). Clearly a more efficient method was needed.
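Since the correlation of the checkerboard kernel along the diagonal is the heart of this method, a schematic MATLAB sketch may help to clarify the passage quoted above. It assumes a precomputed frame-by-frame similarity matrix S (N x N); the kernel half-width L is an arbitrary illustrative value, and the code is a reconstruction of the published idea rather than the routine used in this study.

    % Schematic novelty curve from a similarity matrix S (N x N),
    % after Foote and Cooper [18]. L is the kernel half-width in frames (assumed).
    N = size(S, 1);
    L = 32;
    [u, v] = meshgrid(-L:L, -L:L);
    K = sign(u) .* sign(v) ...                   % checkerboard: +1 within-segment quadrants, -1 across
        .* exp(-(u.^2 + v.^2) / (2 * (L/2)^2));  % Gaussian taper
    novelty = zeros(1, N);
    for i = (L + 1):(N - L)
        region = S(i-L:i+L, i-L:i+L);            % local region around the main diagonal
        novelty(i) = sum(sum(K .* region));      % correlate the kernel with the region
    end
    [~, bounds] = findpeaks(novelty);            % peaks mark candidate segment boundaries

The cost noted above follows directly from this loop: the work per frame grows with the square of the kernel size, which is why a kernel of 1250 samples proved so expensive.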
Figure 7. Novelty curve for the first 3 minutes of Unsound Objects – Method I.
Figure 8. Audio waveform segmented using the novelty curve for the first 3 minutes of Unsound Objects – Method I.
Figure 9. Irregularity plot with section specification notated by a human listener for the first 3 minutes of Unsound Objects.
Figure 10. Novelty curve for the first 3 minutes of Unsound Objects – Method II.
Figure 11. Audio waveform segmented using the novelty curve for the first 3 minutes of Unsound Objects – Method II.
Figure 12. Novelty curve for the first 3 minutes of Unsound Objects – Method II, lower Contrast value.
Figure 13. Audio waveform segmented using the Figure 12 novelty curve – Method II, lower Contrast value.
Automated Segmentation in Practice: Method II

In Method I, segments are determined from peaks in the novelty curve. The novelty curve represents the probability, along time, of the presence of transitions between successive states, indicated by peaks, as well as their relative importance, indicated by the peak heights. For electroacoustic music, we use the spectrum as input to the similarity matrix specification routine. The kernel-based approach is described by Foote and Cooper [18] as follows: 'Novelty is traditionally computed by comparing – through cross-correlation – local configurations along the diagonal of the similarity matrix with an ideal Gaussian checkerboard kernel.' That is, every segment of the piece is compared with every other segment to look for similarities and differences. The sequence of operations is: audio in → spectrum → similarity matrix → novelty → convolution → peaks → segmented audio display → novelty score display.

Method II makes use of the simpler, multi-granular approach outlined by Lartillot, Cereghetti, Eliard & Grandjean [20]: 'For each instant in the piece, novelty is assessed by first determining the temporal scale of the preceding homogeneous part as well as the degree of contrast between that previous part and what just comes next. The idea is to estimate the temporal scale of the previous ending segment as well as the contrastive change before and after the ending of the segment. The novelty value is then represented as a combination of the temporal scale and the amount of contrast'.

Using this multi-granular approach, the following MIRToolbox command yields the novelty curve shown in Figure 10 and the segmented audio given in Figure 11:

        mirsegment(a,'Novelty','MFCC','Rank',1:10,'Contrast',0.6)
Note that this method also uses the first ten Mel-Frequency Cepstral Coefficients (MFCCs) in order to decrease computation time, and the 'Contrast' level is set at 0.6. With this 'Contrast' value there are 8 segments identified in Figure 11. These segments correlate quite well with the 4 sections shown in Figure 9 in the following way: Section 1 (segments 1-3); Section 2 (segments 4-5); Section 3 (segments 6-7); and Section 4 (segment 8).

It is also possible to vary the 'Contrast' parameter to segment on a shorter-term or longer-term event basis, using the same novelty curve. 'Contrast' is defined as: 'A given local maximum will be considered as a peak if the difference of amplitude with respect to both the previous and successive local minima (when they exist) is higher than the threshold value specified'.

For example, by halving the 'Contrast' value to 0.3 (Fig. 12), six additional peaks in the novelty curve are included, and the audio is segmented into 14 segments (Fig. 13). This provides an effective means to vary segmentation from large sections to individual events, depending on the 'Contrast' value. In our examples, segmentation is on the basis of timbre; however, pitch, rhythm and meter could also be used.

In contrast to the 16 minutes required to calculate segmentation using Method I, Method II is at least four times faster and more efficient.

CONCLUSIONS

Within this paper we have examined the determination of a number of temporally related analytical aspects of electroacoustic music, and their representations. We calculated onset times, inter-onset times, and inter-onset rate for Harrison's Unsound Objects. We explored the use of the "rhythmogram" as a means of hierarchical representation in the works of Normandeau's Onomatopoeias cycle.

Finally, we investigated various automated segmentation methods for Unsound Objects. We found the multi-granular approach outlined by Lartillot et al., using MFCCs, to be a very efficient and salient segmentation strategy for music structured predominantly according to timbre (as opposed to pitch or rhythm). Further, the 'Contrast' parameter is effective in determining the granularity of segmentation, from short events to long sections.

Acknowledgments

Many thanks to Vincent Aubanel, who generously shared his rhythmogram code, and to Olivier Lartillot and Petri Toiviainen for making their MIRToolbox publicly available.

REFERENCES

[1] D. Hirst, "Connecting the Objects in Jonty Harrison's Unsound Objects." eOREMA Journal, Vol. 1, April 2013. http://www.orema.dmu.ac.uk/?q=eorema_journal

[2] C. Cannam, C. Landone, and M. Sandler, "Sonic Visualiser: An Open Source Application for Viewing, Analysing, and Annotating Music Audio Files." In Proceedings of the ACM Multimedia 2010 International Conference, pp. 1467-1468.

[3] K. Jensen, Timbre Models of Musical Sounds. Ph.D. dissertation, University of Copenhagen, Rapport Nr. 99/7, 1999.

[4] D. Hirst, "Determining Sonic Activity In Electroacoustic Music." In Harmony: Proceedings of the Australasian Computer Music Conference 2014.
Hosted by The Faculty of the Victorian College of the Arts (VCA) and the Melbourne Conservatorium of Music (MCM), 9-13 July 2014, pp. 57-60.

[5] D. Hirst, "The Use of Rhythmograms in the Analysis of Electro-acoustic Music, with Application to Normandeau's Onomatopoeias Cycle." In Proceedings of the International Computer Music Conference 2014, Athens, Greece, 14-20 Sept. 2014, pp. 248-253.

[6] D. Hirst, A Cognitive Framework for the Analysis of Acousmatic Music: Analysing Wind Chimes by Denis Smalley. VDM Verlag Dr. Muller Aktiengesellschaft & Co. KG, Saarbrücken, 2008.

[7] N. Todd, "The auditory 'Primal Sketch': A multiscale model of rhythmic grouping." Journal of New Music Research, 23(1), pp. 25-70, 1994.

[8] N. Todd and G. Brown, "Visualization of Rhythm, Time and Metre." Artificial Intelligence Review, 10, pp. 253-273, 1996.

[9] D. Marr, Vision. Freeman, New York, 1982.

[10] R. Meddis, "Simulation of Auditory-Neural Transduction: Further Studies." J. Acoust. Soc. Am., 83(3), pp. 1056-1063, 1988.

[11] A. Silcock, Real-Time 'Rhythmogram' Display. Report submitted in partial fulfillment of the requirements for the degree of Master of Computing with Honours in Computer Science, Dept. of Computer Science, University of Sheffield, 2012.

[12] F. Lerdahl and R. Jackendoff, A Generative Theory of Tonal Music. MIT Press, Cambridge, Mass., 1983.

[13] G. Brown and V. Aubanel, Rhythmogram MATLAB code, written and adapted for the Listening Talker (LISTA) project (see http://listening-talker.org/).

[14] A. Woloshyn, "Wallace Berry's Structural Processes and Electroacoustic Music: A Case Study Analysis of Robert Normandeau's 'Onomatopoeias' Cycle." eContact! 13(3), 2010. http://cec.sonus.ca/econtact/13_3/woloshyn_onomatopoeias.html

[15] D. Ogborn, "Interview with Robert Normandeau." eContact! 11(2), 2009. http://cec.sonus.ca/econtact/11_2/normandeauro_ogborn.html

[16] R. Normandeau, Spleen. On music CD Tangram, empreintes DIGITALes, Montréal (Québec), IMED-9419/20-CD, 1994.

[17] O. Lartillot and P. Toiviainen, "A MATLAB Toolbox For Musical Feature Extraction From Audio." In Proc. of the 10th Int. Conference on Digital Audio Effects (DAFx-07), Bordeaux, France, September 10-15, 2007, pp. DAFX 1-8.

[18] J. Foote and M. Cooper, "Media Segmentation using Self-Similarity Decomposition." In Proc. SPIE Storage and Retrieval for Multimedia Databases, Vol. 5021, pp. 167-175, San Jose, California, January 2003.

[19] J. Harrison, Unsound Objects. On Articles indéfinis, empreintes DIGITALes, IMED 9627, Audio CD, 1996.

[20] O. Lartillot, D. Cereghetti, K. Eliard, and D. Grandjean, "A simple, high-yield method for assessing structural novelty." In International Conference on Music and Emotion, Jyväskylä, 2013.
          THE COGNITIVE DIMENSIONS OF MUSIC NOTATIONS
                                                               Chris Nash
                                       Department of Computer Science and Creative Technology,
                                       University of the West of England, Bristol, United Kingdom
                                                       [email protected]
easy or quick to code, given the formal rules of the notation. [1] Different languages (and dialects) offer distinctions in syntax and semantics to facilitate different users and uses. For example: BASIC is designed using simple English keywords to be easily comprehended by beginners (at the expense of structure); Assembler more directly exposes low-level workings of hardware (at the expense of human-readability); and object-oriented languages, like Java and C++, are designed around creating modular systems and abstract data models that map onto user ontologies to enable notation of both low- and high-level concepts. As music notation similarly seeks to support beginners, instrument affordances, and flexible levels of abstract representation, it is instructive to analyse usability factors in notations for programming.

Beyond the format of notation, editing tools also impact the usability of a notation, and although text-based notations can be separated from code editors, other programming paradigms are more integrated with the user experience of the development environment. For example, visual programming languages (VPLs), such as Max/MSP, are manipulated through a graphical user interface, the usability of which impacts how users perceive the language and its capabilities. Other coders develop using an integrated development environment (IDE), offering a unified platform for writing, building, running and debugging code. The integration of such tools allows code edits to be quickly tested and evaluated, accelerating the feedback cycle and thus enabling rapid application development, in turn facilitating experimentation and ideation. [6] Thus, any approach for analysing notation should likewise address factors in the UI.

In music, similar considerations can be made of the design of interactive modes supported by tools to manipulate notations, be that pencil and paper, ink and printer, or mouse and computer screen. Score notation supports composers in creating music, performers in interpreting it, scholars in analysing it, and learners in understanding it. In each case, practitioners use different techniques and tools to interact with the encapsulated music. Moreover, while music plays a functional role in many aspects of culture, it is also about personal, creative expression, and thus it is important to look at how the development of musical ideas is shaped by the design of notations. To consider this, the following section uses the analogue of programming to adapt an established analysis framework that might be used to reveal limitations, influences and opportunities in music notations and interfaces.

3. A USABILITY FRAMEWORK

The Cognitive Dimensions of Notations [1] is a usability framework originally developed by Thomas R. G. Green and Marian Petre to explore the psychology of interaction with notation in the field of programming, breaking different factors of the software designer's user experience into cognitive dimensions that separately focus on affordances of the notation, but which collectively help to paint a broad picture of the user experience involved with editing code and crafting interactive software systems.

The definitions of each dimension (see Section 4) are borne from research in cognitive science, but shaped to operationalise the framework as a practical analysis tool for use by interaction designers, researchers, and language architects. [7] It is intended that each dimension describe a separate factor in the usability of a notation, offering properties of granularity (continuous scale; high/low), orthogonality (independent from other dimensions), polarity (not good or bad, only more or less desirable in a given context), and applicability (broader relevance to any notation).

In practice, these properties cannot always be met. [1, 7] Interactions between dimensions are evident, with either concomitant or inverse relationships. For example, low viscosity (~ ease of changing data) contributes to provisionality (~ ease of experimentation); whereas higher visibility (~ ease of viewing) may reduce hidden dependencies (~ invisible relationships). Moreover, some dimensions are value-laden; intuitively, it may be difficult to see how error proneness, hard mental operations, and hidden dependencies are desirable. However, knowledge of these relationships can be useful in solving usability issues, where a solution to one dimension can be addressed through a design manœuvre targeted at another.

The exact set of cognitive dimensions is not fixed, and various proposals for new dimensions, designed to capture aspects of a notation or user experience beyond the original framework, have been forwarded, many arising from its expanded use in other fields in and around HCI (non-programming interaction, tangibles, computer music). New dimensions should be measured against the aforementioned requirements, but their value is most effectively gauged by how much they reveal about the interaction context in question; arguably the greatest contribution of the framework is that it provides a vocabulary and structure for discussing and analysing notation from multiple perspectives.

As an HCI tool (and in contrast to other usability methodologies), it allows both broad and detailed analysis of human factors in a notation or user interface, adaptable to different use cases and audiences. By considering each cognitive dimension in the context of a specific system, designers and evaluators can assess how the notation fits their user or activity type, whether that's making end-user systems easier to use [5] or making musical interaction more rewarding by increasing challenge. [8, 9]
For a detailed discussion of the background and definition of dimensions in the original framework, see [1]. For further publications on the subject, see the framework's resource site and associated bibliography.1

1 http://www.cl.cam.ac.uk/~afb21/CognitiveDimensions/

4. DIMENSIONS OF MUSIC NOTATION

In this section, sixteen core dimensions of the framework, adapted for a musical context, are detailed and discussed in the context of three common musical interaction scenarios. To evaluate both formal and informal music notation, each dimension is first reviewed in the context of the musical score and sketch (SCORE). The intersection of musical expression and programming is then similarly explored in the context of the Max audio synthesis environment (MAX/MSP). Lastly, the framework is used to review the user interfaces and experiences offered by mainstream end-user systems, through an analysis of digital audio workstation and sequencer software (DAW). In addition to a description of the dimension, each is introduced with a simple question designed to encapsulate the definition in a form that can be used to capture feedback from end-users (e.g. a user survey [3, 8, 10, 11]).

4.1 Visibility

"How easy is it to view and find elements or parts of the music during editing?"

This dimension assesses how much of the musical work is visualised in the notation or UI, as well as how easy it is to search and locate specific elements. While hiding data will make it difficult to find, showing too much data can also slow the search. Pages and screens limit the amount of space available for displaying data, requiring a careful balance of visual detail and coverage.

[Related dimensions: juxtaposability, abstraction management, hidden dependencies, conciseness/diffuseness, closeness of mapping, role expressiveness.]

SCORE: In sheet music, all notated elements are visible on the page; there is no feature to dynamically hide notated elements, beyond using separate sheets. However, music is hidden on other pages, where page turns also present challenges for the typesetter or performer if phrases continue over a join. This can be accounted for in layout, with forethought, but this increases the premature commitment. Things are easier for the composer, as a draft musical sketch need not cater for the performer, and pages can be laid side-by-side (see juxtaposability). Some aspects of the final musical form (e.g. expression and prosody of performance) may not be visually explicit in the musical score (see closeness of mapping).

MAX/MSP: As a visual programming language (VPL), visibility is a key dimension of Max, which explicitly represents the flow of audio and musical data. As in many programming languages, the visibility of process (code/data-flow) is prioritised over musical events (data). In Max, many elements of a system are not visualised, such as the internal state of most objects (e.g. default or current values). There is also no inherent linear/serial representation of musical time, making it difficult to sequence past or future events or behaviour. As such, Max best suits generative and reactive (live) applications.

DAW: Like most end-user software, DAWs offer a graphical user interface (GUI) that is inherently visual. However, different sub-devices (views) reveal or hide different properties of the music; no screen provides a comprehensive or primary notation. Notably, the arrange window is the only window designed to provide an overview of the whole piece, but it filters low-level detail (e.g. notes), which must be edited through other interfaces (score, piano roll, data list). As a result, musical data is dispersed through the UI and can be difficult to find, often involving navigating and scrolling through windows and views with the mouse. Arguably, the primary and most expressive interaction medium for the sequencer is inherently non-visible: performance capture (MIDI/audio recording).

4.2 Juxtaposability

"How easy is it to compare elements within the music?"

Related to visibility, this dimension assesses how notated music can be compared against other data. Pages and moveable windows allow side-by-side comparison, albeit at some cost to visibility. How clearly elements and their purpose are represented will also affect how easy it is to compare notated passages (see role expressiveness). Music systems may also provide tools for non-visual comparisons, e.g. sound (see progressive evaluation).

[Related dimensions: visibility, conciseness/diffuseness, role expressiveness, progressive evaluation]

SCORE: Pages allow side-by-side comparison of elements, and the formal rules for encapsulating music make visual inspection an effective tool for assessing similarity (rhythmic patterns, melodic contour, etc.). However, some musical properties are distinguished more subtly in the visual domain (e.g. harmony, key, transposed parts; see hidden dependencies), requiring musicianship as well as notational literacy to enable effective comparison.
MAX/MSP: Max's windowed system allows side-by-side comparison, so long as abstraction (sub-patching) is applied effectively. Groups of objects can be dragged next to each other, but this becomes cumbersome as the patch grows and objects are intricately woven and linked to surrounding objects (see viscosity and premature commitment). Broad visual similarity and functional similarity may not always align (see role expressiveness).

DAW: As in Max, windowed systems allow side-by-side comparisons, though sizing, scrubbing, and scrolling can be cumbersome in the face of many windows, a common issue in traditional linear sequencers [4, 12]. Most visualisations of musical elements are easy to compare to similar properties of other tracks, bars, etc., and generalised representations (track automation envelopes) also offer a basis for comparison across different musical properties.

4.3 Hidden Dependencies

"How explicit are the relationships between related elements in the notation?"

This dimension assesses to what extent the relationships and dependencies (causal or ontological) between elements in the music are clear in the notation. Showing dependencies can improve visibility, but there is often a trade-off with editing viscosity. For example, in programming, textual source code (e.g. C/C++) can be easily edited, but the relationships between sections of code, functions, and variables are not explicitly shown. However, in visual programming languages (VPLs), objects and variables are linked using arcs, making their functional connection visually explicit, but making it harder to edit, once woven into the rest of the code. [1, 13]

[Related dimensions: visibility, closeness of mapping, role expressiveness, viscosity, conciseness/diffuseness]

SCORE: The visibility of the score ensures no actual data is hidden, except on separate pages, though the musical relationship between notated elements is not always explicit. Some elements are visually linked (e.g. slurs and phrasing) and there are other visual cues that events are related, as in the use of beams or stems to respectively bridge rhythmic or harmonic relationships. However, musical events are sensitive to context, as with dynamic marks, previous performance directions, and key changes, though a visual link between each individual note and the markings that affect its performance would be inefficient to notate explicitly (increasing the diffuseness).

MAX/MSP: A key attribute of all VPLs is the graphical connection of elements using patch cables, which explicitly identifies dependencies between Max objects and helps to show signal flow and the wider architecture of a patch. However, patch execution in Max is also affected by the relative placement and spatial relationship of objects (e.g. right-to-left processing of outlets), which is not visualised explicitly and can lead to unexpected patch behaviour that confuses users. While relations between objects are shown, their specific functional purpose is not explicit and the objects' current state or value is hidden. For example, default values specified as arguments can be replaced by messages, but there is no visual indication that the value of the object has changed from its displayed default value.

Use of sub-patching can also hide functionality, though this is a common trade-off with the additional expressive power offered by abstraction mechanisms. Moreover, as a data-flow environment (and in contrast to imperative programming, as in C++), musical time and the sequence of events are not visually explicit, hiding causal and timing relationships between musical elements.

DAW: The variety of different views and UIs designed for different purposes and perspectives can lead to a large number of hidden dependencies within DAWs. [3] For example, across the different screens and settings there are dozens of variables that impact the final volume of an individual note, and often no explicit visual link between them. Similarly, the routing of audio signals through a DAW is usually not visually illustrated, but dependent on the values of (potentially hidden) drop menus. Some DAWs have attempted to address this: Tracktion enforces a left-to-right signal flow where a track's inputs, inserts, send effects, outputs, and other processes are aligned in sequence (in a row) within the tracks of its arrange screen; whereas Reason takes a skeuomorphic approach, using visual metaphor to the studio, enabling users to inspect and manipulate the wired connections on the back of virtual hardware devices.

4.4 Hard Mental Operations

"When writing music, are there difficult things to work out in your head?"

This dimension assesses the cognitive load placed on users. While this is one of the few dimensions with a prescribed polarity (to be avoided in a user experience), musical immersion, motivation, and enjoyment are predicated on providing a rewarding challenge commensurate with ability, such that music may be one of the few fields where this dimension is to some degree desirable.

[Related dimensions: consistency, hidden dependencies]

SCORE: Formal music notation carries a high literacy threshold, making the score inaccessible to untrained or novice users. Moreover, aspects of the score also require experienced musicians to solve problems in their head,
such as applying key signatures, deducing fingering, etc. (see hidden dependencies). By not notating these elements, scores can be more concise, as well as less prescriptive for interpretation by performers. Interaction with music notation also draws heavily on rote learning and deliberate practice to develop unconscious skills and reflexive playing techniques that would be less efficiently or fluidly performed if mediated through notation.

MAX/MSP: While arithmetic and computation tasks can be offloaded to Max, some aspects of patch behaviour must be carefully approached. Execution order and causal relationships are not visually explicit in a Max patch; users must comprehend the flow of processing to understand the behaviour of their program. Similarly, the lack of a timeline makes time a more abstract concept, making less process-oriented styles of music harder to create and conceive, unless mentally simulated by the user.

DAW: Pro audio software is created with design principles favouring usability and ease-of-use, to be accessible to musicians and non-computer audiences. The various sub-devices in a DAW allow users to edit data through a UI style suiting their background (score, mixer, piano roll, MIDI instrument, etc.). However, because complexity is hidden from the user, there is some risk of such systems becoming less flexible and more opaque, made of black boxes supporting established artistic workflows (see closeness of mapping, premature commitment). The apparent disjunction between usability and the virtuosity musicians embrace in other aspects of their practice (performance, score literacy, composition) may suggest that such users would accept the cost of developing skill, when more flexible approaches to musical creativity are the reward; thus design heuristics based on virtuosity rather than usability may be more apt. [9, 14]

4.5 Progressive Evaluation (Audibility / Liveness)

"How easy is it to stop and check your progress during editing?"

This dimension details how easy it is to gain domain feedback on the notated work during editing. How complete must the work be before it can be executed?

In music, this is defined by what facilities are available to audition the sound or performance of the music. 'Liveness', another concept adapted from programming, defines the immediacy and richness of domain feedback available in the manipulation of notation [15, 16], and is a key factor in the user's feeling of immersion in the creative process and domains such as music. [11, 17]

[Related dimensions: provisionality, premature commitment, hard mental operations]

SCORE: Musical feedback is available through manually playing, sight-reading, and auditioning the notated music using an instrument. Material can be evaluated through performance (possibly requiring transposition) on various instruments, commonly a piano. Crucially, the piece needn't be complete (or 'correct') to audition individual phrases or parts. Moreover, lo-fidelity musical scores (sketches) allow unfinished, informal notation of ideas that can still be interpreted by the composer. There may, however, be a disparity between notated forms and a musical performance, where performers may add their own interpretations to the notes on the page (individual prosody, articulation, rubato, etc.). Simulation of material on a different instrument also relies on the composer's knowledge of the target instrument and related technique; e.g. a piano may be more or less musically flexible, and offer a different timbre, compared to the target instrument.

MAX/MSP: The environment allows patches to be run at any time, though they must be coherent and syntactically correct to evaluate the sound design or music. Good programming practice encourages a modular approach that allows sub-components and simpler configurations to be tested individually, early in development, though their function and output might be abstracted from the final sonic intent of the code.

DAW: The timeline, mixer, playback and track controls (e.g. mute, solo) give the user flexible control over listening back to musical data and auditioning individual edits. A piece can be auditioned as it is built up, track-by-track, bar-by-bar, or note-by-note, and there is no requirement that the 'solution' or notated form be musically correct or coherent to be heard. The rigid UI prevents the entry of nonsensical data, and informal or ambiguous directions (see secondary notation) cannot be auditioned. For digitally-produced music, the sound output offers an exact representation of the music notated in the UI.

Sequencers designed to accelerate the edit-audition cycle enable a higher level of liveness in the user experience of notation-mediated digital music systems, as evidenced by loop- and pattern-based sequencer software such as Ableton Live and most soundtrackers [11], which focus editing on shorter excerpts of music, shortening the feedback cycle. This contrasts with the unbroken linear timelines of traditional sequencers, where (beyond the literally live experience of recording) interaction styles for editing and arranging parts offer lower liveness.

4.6 Conciseness / Diffuseness

"How concise is the notation? What is the balance between detail and overview?"

This dimension assesses the use of space in a notation. Both pages and screens have limited space, and both the
visibility and viscosity of a notation suffer when data escapes from focus. Legibility may also suffer if the notation is simply shrunk or packed tightly, such that conciseness must normally be balanced with careful use of abstraction mechanisms. In music, composers need to be able to access every detail of a piece, but also to get a sense of the 'big picture'. [3, 12]

[Related dimensions: visibility, juxtaposability, hidden dependencies, abstraction management, consistency]

SCORE: The score has evolved to provide a concise representation of music. Unlike digital notations, no abstractions or sub-views are available to hide detail; all elements are always visible, requiring economical use of space. Time is represented using a pseudo-linear scale, where notes are positioned within the bar to reflect relative position in a piece, but bar sizes are compressed such that sparse phrases consume less space. Musical time and page position are further decoupled through the symbolic representation of note duration, such that slow passages (e.g. of semi-breves) do not consume excessive space, but fast passages (e.g. of demi-semi-quavers) are expanded to show the detail more clearly. This symbolic encoding of time, however, does lower the closeness of mapping, increasing the onus on literacy (virtuosity).

MAX/MSP: The layout and density of a Max patch are flexible, though readability suffers when objects are densely packed together or connecting patchcords obscure each other. When complex patches grow outside the confines of a window, visibility suffers and mouse-based interaction can be cumbersome. Abstraction mechanisms such as sub-patching are critical in managing complex systems and avoiding sprawling patches, but trade diffuseness over screen space for diffuseness over separate, possibly hidden windows.

DAW: The variety of notations and views in DAWs offers a varied level of conciseness. The arrange view sacrifices visibility of data to accommodate a broader overview of a piece in the UI. Part editors, like the score and piano roll interfaces, offer more detail (in a manner similar to the traditional score), but only partial views of the entire work. More generally, the lack of a comprehensive principal notation or interface means that information is diffused over different views within the program. Many DAWs do little to optimise window management, navigation, or searching, compounding interaction issues.

4.7 Provisionality

"Is it possible to sketch things out and play with ideas without being too precise about the exact result?"

This dimension assesses how easy it is to experiment with new ideas through the notation or UI, and how fully formed those ideas must be. Accordingly, it is a critical factor in a musical system's support for sketching, ideation, and exploratory creativity. [3, 5, 11, 16] In digital systems, an 'undo' facility significantly contributes to provisionality, allowing inputs and edits ('what if' scenarios) to be trialled and reversed, reducing premature commitment to a particular approach [1], and so reducing the risk of trying new ideas. The dimension is closely related to viscosity and progressive evaluation, where the ease and flexibility of editing and auditioning similarly facilitate exploring new ideas. Secondary notation also offers the opportunity to make incomplete or informal remarks, but in a non-executable form that can't be auditioned.

[Related dimensions: premature commitment, viscosity, progressive evaluation, secondary notation]

SCORE: In a musical sketch, the affordances of paper and pencil support a powerful and flexible medium for capturing part-formed ideas. [5, 18] Pencil can be easily and quickly erased, facilitating experimentation and ideation. By contrast, the formality of the typeset, printed ink manuscript is less flexible and more permanent, used only to finalise a composition for archiving or communication (e.g. to performers). These two instances of score notation complement each other in an established ecosystem that facilitates both composition (creativity) and performance (production) (cf. [19]).

MAX/MSP: The visual drag-&-drop, interactive debugging environment of Max facilitates its use as a rapid prototyping tool, useful in the exploratory design of new audio processing tools and synthesis techniques [13], though some more involved musical constructs or expressions can be harder to develop or articulate quickly, reducing provisionality and ideation. Conversely, as a prototyping tool, Max's focus on experimentation and early-stage creativity comes at the expense of subsequent stages of the creative process ("productivity" [19]): finalisation, refinement, and continued development of designs (e.g. for consumption by end-users, non-programmers, and other musicians) is normally conducted using other development tools (e.g. C/C++).

DAW: Like other authoring tools, DAWs offer multiple ways of quickly adding, editing and deleting elements in the document (i.e. the musical piece). Moreover, the presence of 'undo' functionality makes it easy to backtrack actions, reducing the risk of experimenting with new ideas and encouraging ideation [1]. The primary mode of input, digital audio or MIDI performance capture, in combination with practically unlimited storage (length, tracks, etc.), represents an improvement in provisionality over historic recording techniques (e.g. tape). Users can also address issues in live recordings using advanced overdub tools, without recourse to re-recording entire
                                                               196
4.8 Secondary Notation

"How easy is it to make informal notes to capture ideas outside the formal rules of the notation?"

This dimension evaluates a system's provision for recording information beyond the formal constraints of the notation. As informal notation, such data is typically not executable by computer or performer, and may relate to the encapsulated piece or performance only indirectly. Decoupled from the formal rules of expression in the notation, secondary notations often allow users to make freeform notes to support their editing process, though flexibly designed facilities may be used for a variety of purposes – including evaluation (peer feedback), working out problems, highlighting relationships in the notation, sketching rough high-level structure, aesthetic decoration, to-do lists, incomplete ideas, etc. In programming, code commenting is used to annotate code with useful labels, instructions, explanations, ASCII art, etc., helping to make the code more readable, but also serving as a form of communication between coders. As such, secondary notations should be designed to be as flexible as possible, to allow users to appropriate them for their own needs.
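A trivial sketch (ours, with invented names and values) shows how comments carry exactly this kind of informal, non-executable information:

    # Secondary notation in code: none of this is executed, but all of it matters.
    bpm = 140        # TODO: revisit -- felt rushed in rehearsal
    swing = 0.62     # "just past straight" (the band's own wording)
    # kick pattern, sketched as ASCII art:  |x.x.x..x|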
[Related dimensions: provisionality, hard mental operations, hidden dependencies, role expressiveness]

SCORE: The expressive freedom of pencil and pen marks on paper allows musical scores to be annotated with any additional information, such as personal notes and decoration, as well as irregular performance instructions. Formal notation places more constraints on what is representable, though written language can be freely used in performance directions. The human interpretation of scores enables a further degree of flexibility in applying and developing new terminology, such that informal notes that break from standard semantics may still be executable. Performers can also add their own notes to manuscripts to guide their own interpretation of the piece.

MAX/MSP: Like other programming tools, code comments are an important part of developing and maintaining Max patches. Max's visual medium supports annotations using free text (comment boxes), shaded areas, and imported images (bitmaps), used to explain workings or usage, or as decoration. However, drawing facilities are very limited in comparison to pencil and paper, and even digital graphics, with no provision for freehand sketching or for drawing lines, arrows, or shapes (other than rectangles). Given the proven benefits of such affordances in other music notations (e.g. the musical sketch and score [5]), their omission in such a visual medium is surprising.

DAW: Despite the proliferation of different notational styles in DAWs, each UI is rigidly structured to fulfil a defined purpose and offer specific tools for editing the underlying data. Limited provisions for annotation are provided by way of labelling and colour-coding parts and tracks, and free text is often supported for meta-data, but few mechanisms are provided for flexibly annotating the music in any of the sub-notations or views, beyond those forms formally recognised by the program.

4.9 Consistency

"Where aspects of the notation mean similar things, is the similarity clear in the way they appear?"

This dimension defines how coherent and consistent the methods of representing elements in a notation or UI are. Consistency facilitates the learning of a system (see virtuosity), as users accustomed to a style of presentation can apply knowledge learnt in one area to understand others. However, consistency may also be sacrificed to improve conciseness, visibility, or role expressiveness.

[Related dimensions: conciseness, visibility, virtuosity, role expressiveness, abstraction management]

SCORE: In sheet music, notated passages that are similar musically share similar visual cues, e.g. melodic contour, repeated passages, etc. Formal rules applied consistently likewise ensure recognisable and learnable conventions. However, compromises are made for conciseness, and to optimise the presentation of common expressions, at the expense of readability in less canonical works. For example, the symbolic representation of note rhythm in a passage completely alters if it is offset within the bar (e.g. moved by a quaver). Similarly, the representation of pitch depends on key; an identical phrase requires accidentals following a change of key signature. Both scenarios present limited issues in common practice music, but the inconsistency makes the notation harder to learn and understand, and the difficulty of using it outside its intended purpose encourages conformity, discouraging experimentation and creativity. Moreover, in digital use (notably MIDI sequencers), such inflexibility markedly reduces the usability of score notation, where systems are unable to unpick the expressive prosody in a captured live performance to display a coherent visual score.
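The key-dependence of pitch spelling is easy to demonstrate in code. In the following sketch (our illustration; the flats-versus-sharps rule is a deliberate simplification, not a full spelling algorithm), identical MIDI data is spelt differently once the key signature changes:

    SHARP_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]
    FLAT_NAMES  = ["C", "Db", "D", "Eb", "E", "F", "Gb", "G", "Ab", "A", "Bb", "B"]

    def spell(midi_note, flats_in_key=0):
        # naive rule: prefer flat spellings in flat keys, sharps otherwise
        names = FLAT_NAMES if flats_in_key > 0 else SHARP_NAMES
        return names[midi_note % 12]

    phrase = [63, 65, 67]                              # the same captured notes...
    print([spell(n) for n in phrase])                  # ['D#', 'F', 'G'] in a sharp key
    print([spell(n, flats_in_key=3) for n in phrase])  # ['Eb', 'F', 'G'] in E-flat major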
MAX/MSP: By design, programming languages offer diverse paths to produce similar code functionality. Textual languages are based on rigid, carefully designed formal grammars that ensure basic low-level consistency among programming primitives, also enabling many syntactic errors to be identified during compilation. Max's collection of objects is less formally designed and, as the accumulation of several developers' efforts (and coding styles), less consistent. Inconsistencies exist in many areas, including object-naming schemes, inlet and outlet conventions, processing behaviour, message handling, audio quality (and level), and configuration methods. These nuances produce unanticipated code behaviour that increases the learning curve for novices. Objects behave like self-contained programs or plugins; black boxes that have to be mastered individually.

DAW: The added flexibility in the visualisation of data, in the various views afforded by DAWs, inevitably comes at the cost of consistency of representation throughout the program. For example, volume might variously be represented as a MIDI value (0-127), an automation value (0-1), gain (dBFS, e.g. -96dB to 0dB for 16-bit audio), or using graphics (colour, bar size, rotary knob angle). The trend towards skeuomorphic visual metaphors for electronic studio equipment similarly encourages inconsistencies in representation, drawing on the conventions of previously separate, loosely connected hardware devices. Moreover, while the advent of third-party plugins brings great advantages and creative flexibility, inconsistencies in control, representation, terminology, and interaction create usability issues and a fragmented user experience that is difficult to integrate with the host application.
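A short sketch makes the point concrete; the exact mappings differ between hosts, so the linear MIDI-to-automation scaling and logarithmic fader law below are representative assumptions, not any particular DAW's implementation:

    import math

    def midi_to_automation(value):              # MIDI CC 7: 0-127 -> 0.0-1.0
        return value / 127.0

    def automation_to_dbfs(a, floor=-96.0):     # linear position -> decibels
        return floor if a <= 0 else max(floor, 20.0 * math.log10(a))

    v = 100                                     # one "volume" setting...
    a = midi_to_automation(v)                   # ...is 0.79 as automation data
    print(round(a, 2), round(automation_to_dbfs(a), 1))   # 0.79 -2.1 (dBFS)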
4.10 Viscosity

"Is it easy to go back and make changes to the music?"

This dimension defines how easy it is to edit or change a notation once data has been entered. A common example is knock-on viscosity, where making a simple edit to the notation requires further edits to restore data integrity. High viscosity prevents or discourages alterations, forcing users to work in a prescribed, pre-planned order (see premature commitment); low viscosity simplifies and encourages making changes, reducing the investment associated with trialling new ideas (see provisionality). Being able to easily explore and revisit ideas (ideation) is a key factor in supporting creativity [6, 19], requiring that creative systems engender low viscosity.

[Related dimensions: provisionality, premature commitment, progressive evaluation]

SCORE: The provisionality of pencil marks simplifies the alteration, erasure and overwriting of notes and passages in a musical sketch. If more drastic changes are required, the reduced emphasis on neatness and third-party readability allows the composer to strike out larger sections. Inserting new material is harder, but composers can similarly sketch the inserted passage where there is space (or on a new sheet) and note the insertion. Final manuscripts are intentionally more rigid, but performers can still annotate their copy with alternative instructions.

MAX/MSP: Simple changes to values and local objects are straightforward in Max. However, as patches grow and the interconnectedness of objects increases, Max suffers from knock-on viscosity [1], where one change requires further edits to restore patch integrity. For example, deleting, editing, or replacing objects removes all cords to other objects. Increased viscosity is a common trade-off in tools designed to avoid hidden dependencies, often seen in data-flow and visual programming languages like Max. As a graphical notation, changes to a patch often require its layout to be reworked to make room for object insertions and to maintain readability. In text-based coding environments, such housekeeping is simplified by the inherent serialisation of code, but in VPLs like Max it leads to increased viscosity.
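A toy model of this behaviour (our illustration; Max's internal representation is, of course, different) treats a patch as a graph in which deleting an object also severs its patchcords, leaving repair work for the user:

    # each object maps to the set of objects its outlets connect to
    patch = {"osc~": {"gain~"}, "gain~": {"dac~"}, "dac~": set()}

    def delete_object(patch, name):
        patch.pop(name, None)            # remove the object itself...
        for cords in patch.values():
            cords.discard(name)          # ...and every cord that targeted it

    delete_object(patch, "gain~")
    print(patch)  # {'osc~': set(), 'dac~': set()}: osc~ must now be re-wired by hand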
DAW: As with provisionality, the level of viscosity in DAW interaction varies between the interfaces and interaction modes of the sequencer. By itself, a tape recorder metaphor of recording a live performance makes it easy to erase and re-record a take, but harder to edit recorded data. Audio data can be processed (e.g. EQ, mixing, FX, splicing, etc.), but musical content (e.g. individual notes or harmonies) is not easily addressed or manipulated. Recorded MIDI data is easier to edit, though visual representations (e.g. score – see consistency) and interaction styles can be cumbersome and unwieldy for anything but simple edits. [11, 12]

4.11 Role Expressiveness

"Is it easy to see what each part is for, in the overall format of the notation?"

This dimension evaluates how well the role or purpose of individual elements is represented in the overall scheme of the notation or UI. Different elements may be visually indistinct, or their function may be unclear in the way they are presented. For example, English-language keywords in a menu or programming language can be used to express their function, whereas cryptic symbols or icons may need to be learnt. Alternatively, the visual design of a GUI may impose a consistent aesthetic or layout that fails to capture the diverse functionality encapsulated, or the relationship to other elements of the UI (see hidden dependencies and closeness of mapping).

[Related dimensions: visibility, hidden dependencies, closeness of mapping]

SCORE: While some aspects of the score may be inferred by listening to the music (such as a general sense of pitch and rhythm), most involve learning syntax and rote practice.
Similarly, while some signs offer more expressive visual cues to role (crescendo and diminuendo hairpins; tremolo marks), many do not – clefs, accidentals, key signatures, note shapes, ornaments, and foreign terms symbolise complex musical concepts that require tuition. Once learnt, however, the symbols facilitate the rapid comprehension of notated music – e.g. different note shapes and beaming conventions provide clear differentiation between different note lengths. Meanwhile, recent approaches to contemporary scores tend to exploit more expressive geometric forms, rather than new symbol sets.

MAX/MSP: The role of some specialised objects, notably user controls, is clear from their representation in Max. However, beyond caption and inlet/outlet configuration, Max offers little visual distinction in the representation of most coding objects, which appear as text boxes. Patchcords help to define the context and role of connected objects, and visual distinction is made between audio and message types (though not between int, float, list, or bang subtypes) – but, despite the unidirectional flow of data, flow direction is not depicted (e.g. using arrows).

DAW: Many aspects of DAW UIs rely on a degree of familiarity with studio equipment and musical practice. However, the graphical user interfaces of most packages make prominent use of expressive icons and detailed graphics to indicate the function of controls. Visual metaphor and skeuomorphism are commonly used to relate program controls to familiar concepts. Image schemas and direct manipulation principles are similarly applied to highlight interaction affordances, in the context of both music and generic computer interaction styles.

4.12 Premature Commitment

"Do edits have to be performed in a prescribed order, requiring you to plan or think ahead?"

This dimension defines how flexible a notation is with respect to workflow and the process of developing ideas. Notations or system features that must be prepared or configured before use entail premature commitment. Notations with high viscosity, where it is hard to backtrack, also entail forward planning and commitment. In programming, an illustrative example is the need to declare variables and allocate memory before coding actual functionality (cf. C/C++ vs. BASIC).

[Related dimensions: provisionality, viscosity]

SCORE: A degree of viscosity in altering page layout means that some forward thinking is required to commit musical ideas to the page, which generally proceeds left-to-right, bar-by-bar. However, separate pages allow sections and movements to be developed non-linearly, and the provisionality of the musical sketch allows some flexibility in the development of musical phrases and bars. Multiple approaches to composition are possible: horizontal (part-by-part), vertical (all parts at once, start to finish), bottom-up (bar-by-bar), top-down (musical form). Historically, the literacy and musical experience of composers meant that musical material was often part-formed before being committed to the page – either mentally, or through experimentation with instruments.

MAX/MSP: As a prototyping tool, Max supports experimentation with partially-formed design concepts. Often, however, audio processes will be designed with a plan or general architecture in mind; in Max, forethought with respect to abstraction (sub-patching) or layout benefits development, though housekeeping may be needed retrospectively (see also viscosity). The open-ended canvas allows patches to be flexibly extended in any direction, and a modular approach to programming allows piecewise development of complex systems.

DAW: Musical parts and audio segments can be easily inserted, moved, and copied in the arrange window, though complex phrases with overlapping tracks and automation can be difficult to split and re-sequence. Furthermore, the unified linear timeline and tape recorder metaphor encourage a linear workflow. [12] In modelling studio workflows, DAWs can be seen as transcription tools, rather than environments for exploratory creativity, where artists only turn to the recording process once a work has already taken form. [3, 20] By contrast, pattern- and loop-based sequencers (Live, FL Studio, tracker-style sequencers) offer a flexible non-linear approach to developing and sequencing musical forms, facilitating digitally-supported creativity and flow.

4.13 Error Proneness

"How easy is it to make annoying mistakes?"

This dimension identifies whether the design of a UI or notation makes the user more or less likely to make errors or mistakes. These can manifest as accidental interactions with a program, or as incoherent, unexpected musical results arising from vagueness or ambiguity in the notation (see role expressiveness). In programming, for example, a notation is error prone if its function markedly alters upon the addition, omission, or repositioning of a single character. Errors are broadly undesirable, but can lead to creative, serendipitous formulations in artistic expression. [21]

[Related dimensions: hidden dependencies, role expressiveness]
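A classic single-character demonstration (our example, not drawn from the paper):

    total = 0
    for n in range(10):
        total += n    # accumulate: total is 45 after the loop

    total = 0
    for n in range(10):
        total =+ n    # one transposed character: parsed as total = (+n)
    print(total)      # 9, and no error is reported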
SCORE: In scoring, the literacy threshold means mistakes are more likely during the early stages of learning. Aspects of consistency and hidden dependencies contribute to a user's propensity to make errors. However, as with language, fluency with the notation reduces mistakes. Sketching, as a private medium for the composer, is also tolerant of errors; composers are free to misuse or invent notation, which remains meaningful to them personally. When scores are used for communication, mistakes have consequences; but the impact on the early creative process is minimal.

MAX/MSP: As in any formal language, it is easy to make mistakes in Max through the creation of ill-formed code. However, aspects of the Max UI make it more prone to errors in certain situations. As a graphical UI, mouse interaction is cumbersome, and Max attempts to avoid diffuseness with compact objects, such that selecting and connecting inlets or outlets using patchcords is awkward. As with the score, consistency issues and hidden dependencies also invite mistakes relating to coding semantics.

DAW: Recording performances in real-time heightens the likelihood of input errors, though facilities exist to correct or overdub recorded data, and the occasional mistake is often acceptable for the improved flow (musical and creative) afforded by live interaction with an instrument. As in Max, DAWs invite mistakes through dependence on the mouse, where delicate pointer control is required – many edits require targeting the edge of elements (track segments, notes, etc.), and such hotspots may be small and visually indistinct. Proximate and overlapping objects can be similarly difficult to target.

4.14 Closeness of Mapping

"Does the notation match how you describe the music yourself?"

This dimension assesses how well a notation's representation of the domain aligns with the user's own mental model. In a UI, this also applies to how closely workflows and interaction styles fit a user's working methods. Music perception and aesthetics are quintessentially subjective, making it difficult to encode a universally or intuitively acceptable formalisation, so notations and systems are built around common cultural practices. This can constrain the creative expression or affordances of a notation. To mitigate this, abstraction mechanisms may enable users to appropriate, redefine, and extend systems.

[Related dimensions: role expressiveness, abstraction management, virtuosity]

SCORE: While the score is not an intuitive representation that untrained users might themselves conceive or comprehend, it remains a widespread and established technique for notation in Western music. At the same time, the canonical score systematises music in a way that makes assumptions about the musical practices and aesthetics of its users, such that modern composers identify the format as a constraint on their personal expression and creativity. However, the flexibility offered by individual sketching techniques allows composers to invent and appropriate notation techniques for their own personal use.

MAX/MSP: The data-flow model of Max maps closely to diagrammatic forms used widely in signal processing, with a shared legacy in electronics and circuit diagrams. The inherent role of electronics in the studio, and the representation of audio as voltage, also make this an analogy that musicians and producers can relate to. The functional and visual resemblance to generic flow charts further helps to make the programming environment accessible to non-technical users. However, for musical applications (rather than audio processing) such as arrangement and composition, the abstract representation of time offers a poor closeness of mapping to familiar representations of music. Similarly, for traditional programmers used to imperative programming (ordered sequences of instructions), scripting program behaviour over time is difficult.

DAW: For its intended audience of musicians and sound engineers, traditional sequencers and DAWs provide a strong closeness of mapping, using visual metaphors and interaction paradigms based on studio processes and audio hardware to allow skills transfer. Notably, MIDI and audio recording tools focus interaction on musical instruments. However, in recent years, more computer-oriented musicians, with greater technical literacy, have begun to embrace tools that rely less on analogies to the recording studio and focus on the affordances of digital and computer music technologies – as offered by Ableton Live and FL Studio. Ultimately, engagement with music, as a personal experience, should be based on articulations of the music domain crafted by users themselves, which the rising level of computer literacy might enable, as end-users increasingly engage with programming.
4.15 Abstraction Management

"How can the notation be customised, adapted, or used beyond its intended use?"

This dimension defines what facilities a system offers for appropriating, repurposing, or extending a notation or UI. All notations present an abstract model of a domain (e.g. music, software), providing a set of fixed abstractions representing basic objects (e.g. notes, parts) and properties (e.g. pitch, time, etc.) that enable the articulation of solutions (e.g. a piece). The creative possibilities are defined by what encapsulations of objects are possible and how easy they are to extend. Notations defined for a specific purpose fix the possible abstractions and ways of working. However, the opportunity to define new abstractions (e.g. in terms of existing ones) offers the user a way to develop their own toolset, facilitates the building of more complex solutions (e.g. by abstracting low-level detail), and helps to personalise and raise the creative ceiling of a system. [6] In programming, examples include defining custom functions and abstract data types (objects). In end-user computing, systems may support automation, macros, or plugins to enable users to add new functionality. Simpler abstraction mechanisms, such as grouping and naming elements, are also possible.

[Related dimensions: visibility, closeness of mapping, role expressiveness, conciseness/diffuseness, consistency]
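In code, the simplest such mechanism is the named function, which hides low-level detail behind a reusable name. A minimal musical sketch (the helper and its interval pattern are invented for illustration):

    def arpeggio(root, intervals=(0, 4, 7, 12)):
        # encapsulate the low-level detail (semitone offsets) behind one name
        return [root + i for i in intervals]

    piece = arpeggio(60) + arpeggio(65) + arpeggio(67)   # C, F and G arpeggios
    print(piece)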
SCORE: In sketching a piece during the creative process, composers are able to appropriate or invent new terminology and notation techniques to describe music more concisely (composer shorthand) or to encapsulate unconventional musical devices and practices – only when it is transcribed for communication to a performer (or computer) must it conform to established notational forms.

The canonical score format is more limited; designed around common practices and conventions in formal music, it offers some support for grouping mechanisms (e.g. brackets, phrasing) and abstraction (e.g. custom ornaments). However, a composer can use the preface to a score to introduce original notation techniques and syntax, to instruct the performer's interpretation.

MAX/MSP: As in any programming language, abstraction is a key technique for building and maintaining complex solutions in Max. Max offers several provisions for abstracting and extending code: sub-patches allow embedded and external patches to be nested inside other patches, represented as new objects (linked using inlets and outlets); externals use a plugin model to allow new objects to be coded in C/C++ with defined inputs and outputs; and presentation mode allows patches to be rearranged, simplified and selectively exposed in end-user-oriented UIs.

DAW: Sequencers and DAWs are designed to support specific working styles in music and production scenarios. Part editors support low-level editing of notes and other musical events. In other screens, higher-level abstractions are used to structure music (tracks, parts, etc.), with some provision for grouping and organising objects (e.g. group channels, folders, track segments). Most packages also support audio plugin formats that extend FX processing and synthesis options. However, few sequencers support more flexible abstraction mechanisms to facilitate interaction with notation, such as macros, scripting, or automation. Exceptions include Live, which can be integrated with Max, CAL Script in Cakewalk SONAR, and Sibelius plugins. In the tracker domain, Manhattan [23] offers end-user programming for music using an extended implementation of spreadsheet-style formulae.

4.16 Virtuosity / Learnability

"How easy is it to master the notation? Where is the respective threshold for novices and ceiling for experts?"

This dimension assesses the learnability of the notation, and whether it engenders a scalable learning curve – that is, a "low threshold" for practical use by beginners and a "high ceiling" for flexible expression by experts, affording "many paths" by which users can express themselves. In addition to supporting multiple levels of expertise and creativity, virtuosity should be understood in terms of the balance of challenge and ability experienced by the user. A slight challenge, relative to their ability, intrinsically motivates users and helps create the conditions for flow. [3, 9, 11, 22] Too much challenge and users become anxious; too little and they become bored. The best models for such systems are based around "simple primitives" (building blocks) that can be easily understood by beginners, but flexibly combined to form more complex abstractions and functionality. [6]

[Related dimensions: consistency, progressive evaluation, role expressiveness, closeness of mapping, error proneness]

SCORE: The score has a steep learning curve, and beginners require formal tuition and practice to master it. Novices can be discouraged from learning music by the literacy entry threshold. [3] The complexity of the notation reflects its relatively high ceiling and capacity to flexibly encapsulate a wide variety of musical styles and pieces, though contemporary and electronic composers can find traditional, formal syntax limiting. [2, 3, 12]

MAX/MSP: While programming languages often present a high threshold for novices, Max is explicitly designed for musicians, and uses a visual programming model to appeal to non-coders. Tutorials present beginners with simple patches that produce useful results, enabling a working knowledge to develop quickly. Innovative interactive in-program documentation and a strong user community support both learners and practitioners. There are aspects of the environment that also impede learning (see consistency, error proneness and hidden dependencies). The creative ceiling for developing audio and music systems in Max is high, further supported by abstraction mechanisms – though audio programmers and more music-oriented users may graduate to other tools (e.g. C/C++, OpenMusic, SuperCollider).
DAW: Music and audio production packages are designed to provide a low threshold for musicians and those familiar with studios. The use of visual metaphor and direct manipulation principles allows knowledge transfer from these other practices [4], though users without such backgrounds may struggle. Packages provide a wide array of tools and features for a variety of purposes, though few users will have need of all of them. The ceiling for musical creativity is relatively high, within the confines of conventional practices, though UIs are often optimised for specific workflows and techniques, and users are largely dependent on software developers to provide new opportunities for expression. Unlike with the traditional score and programming languages (like Max), users' efforts to master authoring packages can be frustrated by a lack of continuity between versions.

5. PRACTICAL METHODOLOGIES

This section briefly surveys existing applications of the Cognitive Dimensions of Notations in musical contexts, highlighting both qualitative and quantitative methods for analysing notations and interaction.

Blackwell (with others [7-11, 16, 20, 24]) has used cognitive dimensions to highlight aspects of musical interaction in several settings, including music typesetting software [10, 20], programming languages [16, 24], and digital tools for composition (e.g. sequencers, trackers) [8-11]. In such treatments, the framework provides a language for discussing the affordances of notation, but has also led to the development of tools to elicit feedback from end-users, such as questionnaires that probe dimensions in user-friendly, accessible language. [10] McLean's work on music and art programming languages similarly applies and develops the framework for the analysis of new music coding notations and interfaces. [21]

Nash [3, 9, 11] extended previous qualitative analysis techniques to develop a quantitative approach to evaluating notations. Using a Likert scale, each dimension is formulated as a statement that users can agree or disagree with to a greater or lesser extent. The mean response from a large sample of users can then be used to plot a dimensional profile of the notation under evaluation. Figure 1 shows profiles for a survey of various music sequencer tools (n=245), not only highlighting relative strengths and weaknesses with respect to properties of each UI, but also revealing a general profile for music systems, where the trend may indicate the desired polarity of each cognitive dimension in music interaction.

Figure 1. Cognitive dimension and flow profiles of music tools, based on quantitative user testing [3, 11].

Moreover, the approach was combined with psychometric-style surveys of the experience of creative flow [22], using a battery of questions to also measure users' subjective experience of nine components of flow. Using cross-correlation and multiple-regression analysis, the results for individual flow components and dimensions of the notation were used to identify the key properties of notations facilitating flow; these findings can be used to guide the design of immersive or embodied interaction systems. The study [3, 11] suggests that the key dimensions in the support of flow were visibility (visual feedback), progressive evaluation (audio feedback) and consistency (support for learning and sense of control) – as well as virtuosity (balance of skill and ability), abstraction management (high creative ceiling), viscosity (ease of editing), premature commitment (freedom of action) and role expressiveness (support for learning). The findings were used to propose a set of design heuristics for music systems based around the concept of virtuosity, rather than usability (see [3, 9]).
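Mechanically, producing such a profile is straightforward; the sketch below (with invented numbers, not the study's data) averages 7-point Likert responses per dimension and prints a crude textual profile:

    from statistics import mean

    responses = {                        # dimension -> individual ratings (1-7)
        "visibility":     [6, 5, 7, 6],
        "viscosity":      [3, 4, 2, 3],
        "provisionality": [5, 6, 6, 4],
    }
    for dimension, ratings in responses.items():
        score = mean(ratings)
        print(f"{dimension:>14}: {'#' * round(score)} {score:.2f}")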
6. CONCLUSIONS

This paper has presented a musical reworking of the Cognitive Dimensions of Notations usability framework, and suggested methods and tools for using it to analyse music notations, interfaces, and systems. Several applications have been identified that use the framework to provide insight into the human factors of notation-mediated musical systems, including creativity, virtuosity and flow.

Future work will focus on further use and development of the framework, including its application to other music interaction scenarios and systems, the evaluation of new dimensions, and research into dimensional profiles in other music interactions. The growing intersection of music and programming practice is also likely to reveal further parallels between these creative domains that can inform both theory and practice.

Acknowledgments

The author wishes to thank all those who supported and contributed to this research, especially Alan Blackwell, Ian Cross, Darren Edge, Sam Aaron, the Cambridge Centre for Music & Science (CMS), and the many other researchers exploring and developing the CD framework in other domains. This research was funded by the Harold Hyam Wingate Foundation and UWE (Bristol).
7. REFERENCES

[1] T. R. G. Green and M. Petre, "Usability analysis of visual programming environments: a 'cognitive dimensions' framework," Journal of Visual Languages & Computing, 7, 1996, pp. 131-174.
[2] G. Read, Music Notation: A Manual of Modern Practice, Taplinger Publishing Company, 1979.
[3] C. Nash, Supporting Virtuosity and Flow in Computer Music, PhD thesis, University of Cambridge, 2011.
[4] M. Duignan, Computer Mediated Music Production: A Study of Abstraction and Activity, PhD thesis, Victoria University of Wellington, 2008.
[5] M. Graf, From Beethoven to Shostakovich, Philosophical Library, New York, 1947.
[6] M. Resnick et al., "Design principles for tools to support creative thinking," NSF Workshop on Creative Support Tools, 2005, pp. 25-36.
[7] A. Blackwell, "Dealing with new cognitive dimensions," Workshop on Cognitive Dimensions, University of Hertfordshire, 2000.
[8] C. Nash and A. Blackwell, "Tracking virtuosity and flow in computer music," Proceedings of ICMC 2011, 2011, pp. 572-582.
[9] C. Nash and A. Blackwell, "Flow of creative interaction with digital music notations," in The Oxford Handbook of Interactive Audio, OUP, 2014.
[10] A. Blackwell and T. R. G. Green, "A cognitive dimensions questionnaire optimized for users," Proceedings of PPIG 2000, 2000, pp. 137-152.
[11] C. Nash and A. Blackwell, "Liveness and flow in notation use," Proceedings of NIME 2012, 2012, pp. 28-33.
[12] D. Collins, "A synthesis process model of creative thinking in music composition," Psychology of Music, 33, 2005, pp. 192-216.
[13] P. Desain et al., "Putting Max in perspective," Computer Music Journal, 17(2), 1993, pp. 3-11.
[14] J. A. Paradiso and S. O'Modhrain, "Current trends in electronic music interfaces. Guest editors' introduction," Journal of New Music Research, 32(4), 2003, pp. 345-349.
[15] S. L. Tanimoto, "VIVA: A visual language for image processing," Journal of Visual Languages & Computing, 1(2), 1990, pp. 127-139.
[16] L. Church, C. Nash, and A. F. Blackwell, "Liveness in notation use: From music to programming," Proceedings of PPIG 2010, 2010, pp. 2-11.
[17] M. Leman, Embodied Music Cognition and Mediation Technology, MIT Press, Cambridge, MA, 2008.
[18] A. J. Sellen and R. H. R. Harper, The Myth of the Paperless Office, MIT Press, 2002.
[19] T. Amabile, "The social psychology of creativity: A componential conceptualization," Journal of Personality and Social Psychology, 45(2), 1983, pp. 357-376.
[20] A. F. Blackwell, T. R. G. Green, and D. J. E. Nunn, "Cognitive Dimensions and musical notation systems," ICMC 2000: Workshop on Notation and Music Information Retrieval in the Computer Age, 2000.
[21] A. McLean, Artist-Programmers and Programming Languages for the Arts, PhD thesis, Goldsmiths, University of London, 2011.
[22] M. Csikszentmihalyi, Creativity: Flow and the Psychology of Discovery and Invention, HarperCollins, New York, 1997.
[23] C. Nash, "Manhattan: End-user programming for music," Proceedings of NIME 2014, 2014, pp. 28-33.
[24] A. Blackwell and N. Collins, "The programming language as a musical instrument," Proceedings of PPIG 2005, 2005, pp. 120-130.
TUFTE DESIGN CONCEPTS IN MUSICAL SCORE CREATION

Benjamin Bacon
IDMIL, CIRMMT, McGill University
[email protected]

1. INTRODUCTION

[…]
The more ink that is used on non-representative graphics, the more the designer endangers the clarity of the graphic's intent.

In each of his publications, dozens of techniques and examples are shown describing how to layer, differentiate, and communicate information efficiently. The notions of chartjunk and the data-ink ratio hold at their core the idea that graphical representations of information should maximise the space they inhabit. This promotes the efficient transfer of ideas, unobscured by the noise of poor design. Without a careful understanding of how graphical content interacts with itself, and indeed with other content, it is easy to obfuscate critical concepts while simultaneously distracting the observer's thinking and attention. While minimalism may come to mind when adopting Tufte's design strategies, he does not advocate the avoidance of complexity. In fact, many of his examples are dense and complicated, at times showing thousands of data points in a single image.
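Tufte quantifies this idea in The Visual Display of Quantitative Information as a simple ratio to be maximised, within reason:

    data-ink ratio = data-ink / total ink used to print the graphic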
Above all, Tufte's principles ask the designer to employ techniques relevant to the cognitive task at hand. It is up to the designer to choose a methodology which communicates the primary message of the presented information. As Tufte clearly states [8]:

    If the thinking task is to understand causality, the task calls for a design principle: show causality. If a thinking task is to answer a question and compare it with alternatives, the design principle is: show comparisons.

1.2 Information Design and the Score

The inclusion of diverse graphical examples in Tufte's books, spanning many different disciplines and points in history, is an effort to showcase the universality of his theories. These graphic examples are meant to inspire and provoke thought on a variety of possible design scenarios, while also demonstrating how a technique can be used in a given discipline. Information design is interdisciplinary in nature, and can be applied to nearly anything where information is represented graphically.

Therefore, it requires no stretch of the imagination to apply Tufte's principles when creating a musical score. Composers, especially of contemporary music, work with an ever-expanding palette of sonic parameters [9]. While most composers are not specifically attempting to display data or evidence for something that already exists, visuals are employed to provide reasoning for something that is about to happen (i.e. a performance).

Today, extended techniques, electronic processing, and even new instruments themselves place interesting demands on the composer. In many cases, complicated ideas must be communicated graphically to the performer, making careful distinctions between the representative elements of dynamic, technical, rhythmic, timbral, and pitched content. Further distinctions between micro- and macro-formal trends are often useful for performers, and add further demands to the score.

2. EXAMPLES IN SCORE-DESIGN

The following sections discuss three examples of Tuftian design theories employed by the author in his own compositions, including one work from a scientific study. These examples discuss specific musical ideas, and how information design theories can provide the composer with a useful toolbox for finding innovative solutions to graphical demands.

2.1 Graph-based Notation in de Chrome

The first example inspired by Tufte's writings on information design is a piece entitled de Chrome, written by the author in 2012 and seen in Figure 2. This composition gives the performer a role in shaping the piece on a micro-level, while the larger form is dictated. Choices can be made by the performers as to which content to perform and which pitches to sound, but these are limited to a sub-phrasal level. Overall, the piece is to be performed by 3-5 players on any instrument.

This piece comprises graphical sub-phrases grouped together by page, as seen in Figure 2a. Each graph depicts the dynamic contour of a sustained sound, with dynamic references located on the y-axis. The duration of each sub-phrase is indicated by the two numbers in the top left-hand corner of the page, best seen in Figure 2b. In this case, the performers are instructed to choose any three sub-phrases (with no repeats), and to perform each one for thirty seconds; hence the notation 3x30''. After the completion of the phrases, the performers may move on to the next page. The gradient shading refers to the level of timbral pressure exerted in each phrase. Black indicates a heavy amount of force, while white indicates little or no extra force. This can correspond to the pressure exerted on the bow, embouchure, etc.
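The selection rule amounts to sampling without replacement; a small sketch (our paraphrase of the instruction, with hypothetical sub-phrase labels):

    import random

    subphrases = ["2a", "2b", "2c", "2d"]        # hypothetical page contents
    chosen = random.sample(subphrases, k=3)      # any three, with no repeats
    plan = [(phrase, 30) for phrase in chosen]   # perform each for 30 seconds
    print(plan)                                  # e.g. [('2c', 30), ('2a', 30), ('2d', 30)]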
The small box beneath the contour graph contains the group of pitches the performer may choose from when sounding the sub-phrase. Note-heads with no indicated pitch are to be interpreted as non-pitched sustained sounds: the performed sound must, above all, not contain a tuned pitch; detuned sounds are appropriate. Outside the box, percussive articulations are marked with the + symbol.
[Figure 2: score pages from de Chrome, showing dynamic-contour sub-phrase graphs (dynamics ff-pp on the y-axis, with gradient shading and pitch boxes) under the duration marking 3x30''.]

[…] line. Grey-scales are superior to colouring techniques in displaying hierarchical content. As mentioned by Tufte […] for increased viewing resolution in a smaller space, and using less ink. An additional benefit of the increased information resolution of the visual techniques employed in de Chrome is the opening of white space for extra notation […]
[…]ing its own content on a different instrument using different techniques. This division was the primary goal of the composition, as it was intended to explore the physical difficulties of combining and separating the hands from each other.

[…]ducing administrative elements, while simultaneously increasing the data-ink ratio on the page. The system of notation in Dextral Shift places weighted emphasis on the most important elements of the score, while enabling graphical flexibility. The removal of staff lines clears up the page and avoids the activation of negative space. Unused grid-space, especially grid-space with pronounced lines, can often be confusing [12]. The unnecessary interaction of graphical elements can make the discernment of important information from non-important/existing information quite difficult [13].
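For reference, the data-ink ratio invoked here is Tufte's own measure (our gloss from [1], not a formula given in this paper):

\[
\text{data-ink ratio} = \frac{\text{data-ink}}{\text{total ink used to print the graphic}}
\]

Erasing non-data ink, such as the staff lines removed above, moves the ratio toward its maximum of 1.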
Figure 4. An excerpt from the music for the handedness study. The underlined segments highlight the areas where Tufte's concepts of disinformation design were employed.
[…] of disinformation design, found in the book Envisioning Information [3], was used.

Disinformation design is in most ways the complete opposite of information design; the goal is to obscure the truth or to produce an illusion. In Tufte's book Visual Explanations, an entire chapter is co-authored with magician Jamy Ian Swiss, as various examples of technical explanations and magic guidebook diagrams are discussed in detail. The notational methodology for the excerpt presented in the handedness study specifically disguised the introduction of new, and more challenging, material.

In Figure 4, the underlined segments highlight areas where disinformation design tactics were employed. The first underline (measure 11) is a syncopated 3/4 pattern using 16th-notes. The measure begins with two 8th-notes, establishing a firm rhythmic foundation for the measure. The syncopated rhythm is written using 16th-notes only for the note-heads, and a collection of 16th-note and 8th-note rests for the spaces in between. The busy notation of this measure obscures a sense of regularity, and masks the metric identity of the measure. The graphical repetition of the 16th-note rest pairs serves as a kind of notational anomaly. When two rests of equal value are paired, they are usually grouped into one. Using two rests requires more subdividing by the performer, which relies on internal counting strategies.

Measure 11 was repeatedly misplayed by the participants of the handedness study, as many had to suddenly dial in on the resolution of their internal counting, which usually happened too late. The designed error-zone provided an opportunity to observe which hand would be used when intensified timing-based decisions needed to be performed. Consistent with previous findings on the matter [16], the preferred-hand performed most of the notes in this measure.

The second underlined segment seen in Figure 4 begins with a quintuplet figure with an 8th-note rest on the fourth beat. Following the quintuplet figure is another syncopated rhythm. The visual presentation of the syncopation is partially what makes it challenging. The isolated 8th-note on the quintuplet leads into the syncopated rhythms consisting of a 3/8 feel over several 2/4 measures. The graphical representation of the last quintuplet note in measure 13 leads into the next bar, much like the 8th-note in measure 15. While they look similar, one is bound to the beat-matrix while the other is not. This entire system (mm. 13-17) was one of the most complicated segments to read. A clear majority of participants performed these measures with only their preferred-hand.

                    3. DISCUSSION

Musical content is represented graphically in a diverse and varied landscape, full of rich historical context and traditions. In musical notation, there is no right or wrong way to pursue or represent an idea. Composers often work with abstracted concepts in a visceral way, which is in turn reinterpreted by the performers and the audience. Conversely, information design is often grounded in verifiable data. Graphical elements are used in information design to give form to numbers and reveal trends. Information design is concerned with the visual presentation of evidence. Perhaps because of these differences, musical notation techniques have remained separate from the quantitatively-driven world of information design. Music's representation on paper is entirely arbitrary, and often self-contained. In contemporary music, the composer devises a new graphical language for each piece [17], largely shaped by what the composer wants to communicate.

Quantitatively speaking, the way in which traditional music is represented revolves around a grid-based system, where exact information can be presented. Timing and pitch can be precisely notated, but this system has been challenged to a great extent due to its limitations in displaying
highly-specified technical information [18]. The limitations of grid-based notation gave way to graphical-based methods. Interestingly enough, the two systems have generally been segregated and have been thought to conflict with one another [19].

                    4. CONCLUSION

This paper sought to explore the possibilities of combining information design tactics, most notably those of Edward Tufte, and musical composition. Tufte's books trace common mistakes and important solutions in presenting information throughout history. These examples and lessons can provide musicians with a rich resource for solutions to displaying challenging musical material. As previously mentioned in Section 1.1, one of the primary questions designers of graphical content should ask is: what is the thinking task? The graphical representation of any given idea should help aid that task.

The examples presented herein were solely produced by the first author, as it was not the goal of this article to critique the works of others, or to highlight what makes a score good or bad. Composers are free to use any method or system necessary to express themselves, and in most cases the ends justify the means. Traditional western notation is highly customisable, and serves as the framework for a great deal of the contemporary music written today. It is, at its core, a highly successful and excellent example of information design. Furthermore, while it has the potential to be graphically difficult to navigate, western notation's visual grammar is widely recognized and familiar to its users. The difficulties of understanding its systemic structure are usually overcome in the early stages of a musician's career, allowing the experienced and professional to transcend any possible limitations of the notation in their music making.

Tufte's work is at its heart multidisciplinary, leaving an open framework for interpretation by musicians. His ideas open the door to countless other persons and organizations who have discovered solutions to complex graphical questions. Musical composition is certainly complicated and multidisciplinary as well. Inspiration can be drawn from anything when writing music. The work of Edward Tufte and the world of information design have the potential to be a rich resource for imaginative compositions in the future.

                    5. REFERENCES

[1] E. R. Tufte, The Visual Display of Quantitative Information, 2nd ed. Cheshire, CT: Graphics Press, 1982.
[2] ——, Envisioning Information. Cheshire, CT: Graphics Press, 1990.
[3] ——, Visual Explanations: Images and Quantities, Evidence and Narrative. Cheshire, CT: Graphics Press, 1997.
[4] ——, Beautiful Evidence. Cheshire, CT: Graphics Press, 2006.
[5] ——, The Cognitive Style of PowerPoint. Cheshire, CT: Graphics Press, 2003.
[6] R. Schmid, "The Visual Display of Quantitative Information by E. R. Tufte," Taxon, vol. 38, no. 3, p. 451, 1989.
[7] A. Bose, "Visual Explanations by Edward Tufte," The Indian Journal of Statistics, vol. 59, no. 3, 1997.
[8] M. Zachary and C. Thralls, "An Interview with Edward R. Tufte," Technical Communication Quarterly, vol. 13, no. 4, pp. 447–462, 2004.
[9] T. Murail, "The Revolution of Complex Sounds," Contemporary Music Review, vol. 24, no. 2, pp. 121–135, January 2007.
[10] B. Bacon and M. M. Wanderley, "The Effects of Handedness in Percussion Performative Gesture," in Proceedings of the 10th International Symposium on Computer Music Multidisciplinary Research, 2013, pp. 554–560.
[11] M. Corballis, Human Laterality. Elsevier, 1983.
[12] M. Schrauf, B. Lingelbach, and E. R. Wist, "The Scintillating Grid Illusion," Vision Research, vol. 37, no. 8, pp. 1033–1038, 1997.
[13] J. Albers, "One Plus One Equals Three or More: Factual Facts and Actual Facts," Search Versus Re-Search, pp. 17–18, 1969.
[14] "Handedness in Percussion Sight-Reading," website, July 2014. [Online]. Available: http://dl.acm.org/citation.cfm?id=2617995&coll=DL&dl=GUIDE
[15] G. Cook, Teaching Percussion. Schirmer Books, 1988.
[16] M. Peters, "Hand Roles and Handedness in Music: Comments on Sidnell," 1986.
[17] C. Cardew, "Notation: Interpretation, etc.," Tempo, New Series, no. 58, pp. 21–33, 1961.
[18] E. Brown, "The Notation and Performance of New Music," The Musical Quarterly, vol. 72, no. 2, pp. 180–201, 1986.
[19] P. Boulez, Orientations: Collected Writings. Harvard University Press, 1990.
     NOTATION AS INSTRUMENT: FROM REPRESENTATION TO ENACTION
TANGIBLE SCORES AND GESTCOM AS COMPOSER AND PERFORMER PERSPECTIVES RESPECTIVELY

Tomás-Kaltenbrunner Tangible Score: inherent score and multi-morphophoric sound elements

The question of an inherent-in-the-instrument score frames in new terms the problem of interaction design and affordance exploration of instruments and notations alike (a demo of Tangible Score is available at http://vimeo.com/80558397).

The problem of a differentiation between scores and interfaces is largely debated in the NIME community. A NIME designer develops a notation system that is inherent to the instrument. The designer thus cancels the difference between music composer and instrument maker: the score is the instrument. The definition is compatible with Atau Tanaka's definition of instruments as open-ended systems, whose architecture includes a structural-compositional layer, next to the input and output systems, mapping algorithms and sound synthesis systems [5].

The example provided by Tangible Score highlights very well this particular evolution. He claims that different layers, namely the instrument and the score, accompany the interaction between the composer and the performer. However, the evolution of electronic instruments implies a radical change in this perspective: the construction of the instrument is not only an instrument-maker's realization, but becomes an act of composing.

Inherent scores are in this sense an expansion of what an instrument normally is: these instruments expand and reinforce their affordances, turning into objects acting in the sense of musical composition. The instrument implies gestures and sounds, exploding in a multiplicity of instrumental morphophoric elements. Duchez defines morphophoric: "The notion herein referred to as morphophoric - or form-bearing - element, has always and unfailingly guided musical action, that is to say strategies of production (inspiration, invention, representation, execution) and reception (listening, memorization). But this essential guidance is first of all only a more or less conscious, empirical practice based on immediate perception. Its efficiency, therefore, though direct and reliable, is limited, and it corresponds to what are generally called "primitive", orally-transmitted musics" [6].

Graphic scores as proto-inherent scores

Tomás and Kaltenbrunner trace back the development of the notion of inherent scores to the 1960s, and in particular to graphic scores.

The NIME designer programs the affordances. In this sense the instrument tends to be part of the composition, exactly as a graphic score was in the 50s or 60s. These scores are interfaces of interaction with the instruments: the sound result is open, but conducted by the graphic constructions prescribed by the score. Inherent scores are similar to graphic scores, except that the former are sound-producing and performable while the latter are only representational. As recalled by Tomás and Kaltenbrunner:

  […] performing became the creative exploration in freedom of the musical affordances, musical reactions or acoustic relations to the physical space performed, without the need of any kind of musical notation.

In this sense, inherent scores are evolutions of graphic scores, conceived as musical interfaces. Composers design the instrument, after Lachenmann's motto: "Komponieren heißt: ein Instrument bauen" (to compose means: to build an instrument).

The tangible score is the result of a compositional process that enacts gestures and strategies:

  We define a tangible score as the physical layer that is incorporated into the configuration of a digital instrument with the intention of conducting the tactile gestures and movements.

Thus, the tangible score influences and orients the process of enactment of the instrument: it affords tactile gestures and movements. In this sense this instrument embodies gestural scores.

However, Tomás and Kaltenbrunner focus mainly on the physical interaction, avoiding the problem of the acoustic one. For them the tangible score,

  as a traditional score, it encodes a musical intention and delegates the decoding part to other agents.

That is partially true: a traditional score implies sounds that a gestural one does not. The score of a violin sonata is an encoding of the intention via the gestures, which leaves the decoding to another agent. However, we must remark that we cannot program the sound of a violin differently. In this sense the tangible score is not exactly traditional, but rather an exciting new extension of traditional possibilities. Each instrument has compositional constraints, but, until now, instruments have been the result of a historical and intersubjective evolution based on fundamental morphophoric elements - like pitches; the tangible score, as most NIMEs, is designed around open morphophoric elements that can be chosen by the composer or the performer, inventing in that manner different possible arrangements of the score.
GesTCom (Gesture Cutting through Textual Complexity)

A different example of a shared multimodal platform which amalgamates instrument, gesture and notation is the GesTCom. Its novelty lies in that it highlights the enactive potential of traditional musical scores from a performer-specific (rather than composer-specific) perspective.

It was developed in the course of a 2013-2014 musical research residency at Ircam, as a prototype system based on a. the performative paradigm of embodied navigation of a complex score, b. the INScore platform and c. the Gesture Follower [7]. The concept of corporeal (or embodied) navigation attempts to offer an embodied and medial performer-specific alternative to the classical UTI (Understanding-Technique-Interpretation) paradigm. Instead of a strictly linear arrangement of its formants - understanding notation, then purposefully employing technique, and then allowing, in the end, for expressive interpretation - it proposes the conceptualization of learning and performance as embodied navigation in a non-linear notational space of affordances: the performer "moves" inside the score in several dimensions and manipulates in real-time the elements of notation as if they were physical objects, with the very same gestures that s/he actually performs. This manipulation forms an indispensable part of the cognitive processes involved in learning and performing, and transforms the notation. This transformation can be represented as a multilayered tablature, as in Figure 1.

b. INScore [8] is an open source platform for the design of interactive, augmented, live music scores.

INScore extends the traditional music score to arbitrary heterogeneous graphic objects: symbolic music scores but also images, texts, signals and videos. A simple formalism is used to describe relations between the graphic and time space and to represent the time relations of any score components in the graphic space on a master/slave basis.

It includes a performance representation system based on signals (audio or gestural signals).

It provides interaction features at score component level by way of watchable events. These events are typical UI events (like mouse clicks, mouse move, mouse enter, etc.) extended in the time domain.

These interaction features open the door to original uses and designs, transforming a score into a user interface or allowing a score's self-modification based on temporal events.

INScore is a message-driven system based on the Open Sound Control (OSC) protocol. This message-oriented design lends itself to remote control and to real-time interaction using any OSC-capable application or device (typically Max/MSP, Pure Data, but also programming languages like Python, Csound, SuperCollider, etc.).

A textual version of the OSC messages that describe a score constitutes the INScore storage format. This textual version has been extended as a scripting language with the inclusion of variables, extended OSC addresses to control external applications, and support for embedded JavaScript sections.

All these features make INScore particularly suitable for designing music scores that need to go beyond traditional music notation and to be dynamically computed.
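As a minimal sketch of this message-driven design (our illustration, not from the paper), the lines below drive INScore from Python with the python-osc package. We assume INScore's documented /ITL address space and its default UDP input port (7000); the object name "score" is arbitrary.

    # Sketch: remote-controlling INScore over OSC from Python (python-osc package).
    # Assumes a running INScore viewer listening on its default UDP port 7000.
    from pythonosc.udp_client import SimpleUDPClient

    client = SimpleUDPClient("127.0.0.1", 7000)

    # Create a symbolic score object from Guido Music Notation ("set gmn") ...
    client.send_message("/ITL/scene/score", ["set", "gmn", "[ c d e f ]"])
    # ... then move it in the scene's graphic space (coordinates range over [-1, 1]).
    client.send_message("/ITL/scene/score", ["x", -0.3])
    client.send_message("/ITL/scene/score", ["y", 0.2])

The same messages, written textually, are what the INScore storage format records, which is why a score and the script that produces it coincide.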
c. The Gesture Follower was developed by the ISMM team at Ircam [9, 10]. Through the refinement of several prototypes in different contexts (music pedagogy, music and dance performances), a general approach for gesture analysis and gesture-to-sound mapping was developed.

The "gesture parameters" are assumed to be multi-dimensional and multimodal temporal profiles obtained from movement or sound capture systems. The analysis is based on machine learning techniques, comparing the incoming dataflow with stored templates. The creation of the templates occurs in a so-called learning phase, while the comparison of a varied gesture with the original template is characterized as "following" (a toy sketch of this comparison appears at the end of this section).

The GesTCom, equally rooted in embodied navigation, INScore and the Gesture Follower, takes the form of a sensor-based environment for the production and interactive control of personalized multimodal tablatures out of an original score. As in the case of embodied navigation (Figure 1), the tablature consists of embodied representations of the original (Figure 2). The novel part is that those representations derive from recordings of an actual performance and can be interactively controlled by the player. The interaction schema takes the form of the feedback loop shown in Figure 3.

More specifically, the input performative gesture produces four types of recorded datasets (gestural signals, audio, MIDI and video), which are subsequently used for the annotation, rewriting and multimodal augmentation of the original score. Those output notations are embodied and extended: they are produced through performative actions, they represent multimodal data, they can be interactively controlled through gesture and they can dynamically generate new varied performances.
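To make the template-comparison idea concrete, here is a deliberately simplified sketch (ours, not Ircam's implementation, which relies on probabilistic real-time alignment): templates stored in a "learning phase" are compared against a partially observed gesture, returning the best match and the position reached within it. All names and profiles are invented toy data.

    # Toy gesture "following": compare an incoming, partial gesture profile
    # against stored templates; report the closest one and the progress in it.
    import numpy as np

    templates = {  # learning phase: one 1-D temporal profile per gesture
        "bow stroke": np.sin(np.linspace(0.0, 3.0, 100)),
        "hand lift":  np.linspace(0.0, 1.0, 100),
    }

    def follow(partial):
        n = len(partial)
        # mean squared distance to the first n frames of each template
        dist = {name: float(np.mean((tpl[:n] - partial) ** 2))
                for name, tpl in templates.items()}
        best = min(dist, key=dist.get)
        return best, n / len(templates[best])  # which gesture, and how far along

    incoming = np.sin(np.linspace(0.0, 1.5, 50)) + 0.02  # noisy first half of a "bow stroke"
    print(follow(incoming))  # -> ('bow stroke', 0.5)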
  Figure 1. The embodiment of a Xenakian cloud / fingers-, hand-, and arm-layer in 1b, 1c, 1f respectively
Figure 2. INScore tablature of combined representations. They can be synchronized with video and audio and interactively controlled. The player navigates between the several representations.
They can be considered as the visualization and medial extension of the player's navigation in the score-space, creating an interactive feedback loop between learning and performance.3

A milestone in subsequent developments towards the representation of independently organized, or decoupled, actions towards indeterminate sound results is offered by the work of Klaus K. Hübler, and in particular his article "Expanding String Technique" [11]. There, Hübler sought to present a "completely new perspective on the instrument" through "an expansion of sound and technique that has its roots in the specific resources of the instrument and its manner of performance".
Figure 4. A page from Aaron Cassidy's Second String Quartet
We try to argue that scores, instruments and compositions seem to have a common essence. If scores are, or might be, abstract symbolic interfaces, and instruments concrete ones, we highlight how the recent evolution of new musical interfaces seems to make the boundary fluid.

Scores and instruments not only converge today in multimodal interfaces, but have, in our opinion, a common essence characterized by the typology of intentionality, based on the effort of projection of the maker: composition of scores or construction of instruments.

The making of musical instruments involves action and perception; it also involves the understanding of the action-relevant value of sounds, the judgment of these sounds in view of musical ideals, and the shaping of a physical environment that produces sound: projections of movements in virtue of the absence of physical presence.

The composer, the performer and the instrument-maker project the sound-object in time: they must project their subjective experience in an intersubjective dimension. Projection is the expectation of reality based on past experience.
The action-reaction cycle proposed by Marc Leman as a paradigm for instrument making (and, more widely, for music making and perception) frames this concept theoretically for us [12]. If the process of instrument making described by Leman as the synergic relationship between "Play, Listen, Judge and Change" is true, then the process of composition can be equally described. In fact:

  While musical instrument is being built, a set of action-reaction cycles, which may occur on different time scales and perhaps in hierarchical order, transforms matter and energy into a cultural artefact for making music [Leman, 2007: 52]

There are forms of projections through writing that evolve in technology. Performers and composers are entailed in a similar form of projection, characterized by a different degree of distance from the gestural and sonic output. The projection is the conception of a process of accumulation of experience that comes to define the good shape of the instrument and of the score. Leman underlines the process as the Ratchet Effect:

  […] the actual process of instrument-making (ontogenesis) and the history of how an instrument evolves (phylogenesis) can be seen as the result of a repeated cycling of Play, Listen, Judge, Change. The action-reaction cycling is a dynamic model with the capacity to subsume a cumulative process similar to a ratchet effect [Leman, 2007: 54]

In our opinion, we can extend this model from instrument to notation, assuming that in both of them perception induces intentionality and anticipation: "the world is conceived from the viewpoint of action and prediction rather than merely based on the construction of gestalts".

Scores are the result of a ratchet effect, in the sense that they embody the cumulative growth of knowledge over the last centuries, similarly to instruments. The abstraction of musical practice into a small number of variables allows a global control of the instruments, which extends to a certain control of the body of the performer. This kind of prescriptive approach is similar to that of machines, which are totally, or almost totally, controlled. In this sense the composer uses the score as an instrument, as a temporal and physical interface of abstract interaction in time and space: scores are extensions of the body of the composer in the body of the performer via the projection of the instrument represented by the score. That creates a singular temporal dimension based on the absence and presence of the instrument: the composer constructs absences and the performer reconstructs the projected presences.

Notational system as performed system

We would like to suggest a framework for the definition of the score as instrument, drawing a line between the programming of the sound result and the design of instruments and scores. We would like to argue that if scores are instruments, then this common essence is still developed in NIMEs.

As highlighted by Tomás and Kaltenbrunner, circuits are conceived as scores and instruments, because their combination implies specific sounds. This relationship is at the basis of the conception of synthetic instruments. Also for Max Mathews the computer is an instrument [13]; at the same time the computer is not a normal instrument: it performs data that are memorized and activated.

In the case of NIMEs, the computer is still central. The computer controls the loudspeaker, but the musical interface controls the computer. It is a particular instrument that not only can be controlled by interfaces, as keyboards control organs, but can be programmed in infinite manners.

The interfaces have a role similar to that of scores: they generate information in real-time, but still record and encode data: interfaces are causal for scores.

Anne Veitl [14], following Cadoz's work [15], focuses on the notion of causality, which is the central element of the relation between scores and instruments. The comprehension and the definition of causality lie at the centre of the definition of the musical instrument. Veitl's model allows a kind of generalized instrumentality: highlighting the principle of causality as fundamental, it becomes evident that instruments and scores are part of the same causal process.

Criteria of a performed notational system

Considering the sound synthesis environments partitioned as score and instruments, Anne Veitl proposed to interpret softwares as notational systems.

Veitl proposed six criteria that seem to us to highlight some general properties of notational systems and instruments at the same time. These criteria stress the fact that softwares are notations and, essentially, performable notations. A notational system is primarily (see the sketch after this list):
  a. material: it must be somewhere, memorized on a concrete and existing object, the paper or a hard disc;
  b. visible: that is why the machine language is not a notation, but softwares are visible;
  c. readable: it has to be read by a machine, a human being or both;
  d. performative: it describes the action potential of a system. Softwares and computers are highly performative because the material inscription is translated instantaneously into sound;
  e. systemic: the signs, or the physical elements of the system, can operate structurally;
  f. causal: notation must indicate and enable sounds. It must indicate the manner and the means necessary to produce the sound or the event.
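A minimal sketch (our illustration, not Veitl's own example) of how even a few lines of software can satisfy these criteria: the text below is a material, visible and readable inscription, and executing it causally produces the sound event it indicates.

    # A tiny "performable notation": indicates and enables a single sound event.
    import math, struct, wave

    RATE = 44100
    FREQ, DUR = 440.0, 2.0   # the "signs": which sound, and for how long (criterion f)

    with wave.open("tone.wav", "w") as out:   # material: an object on disk (criterion a)
        out.setnchannels(1)
        out.setsampwidth(2)
        out.setframerate(RATE)
        for n in range(int(RATE * DUR)):      # performative: inscription becomes sound (d)
            s = int(32767 * 0.5 * math.sin(2 * math.pi * FREQ * n / RATE))
            out.writeframes(struct.pack("<h", s))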
In this sense, for Veitl, softwares are scores, and NIMEs are expressions of this essential character.

                    CONCLUSIONS

Technological advances have broadened our conception of notation and instrument as mutually shaping, action-oriented, open-ended systems, as much as they have contributed to their actual, material amalgamation.

Tomás' tangible score and Antoniadis' GesTCom offer instances of new interfaces-and-scores, which have historically followed up from graphic and action-oriented notations. In those instances, notation and instrument share common criteria (Veitl) and evolutionary cycles (Leman) beyond the classical prescriptive-descriptive dichotomy, materializing both representational and enactive cognitive features.

Eventually the very communicative chain and the roles between instrument-makers, composers, performers and computer-music designers are to be genuinely rethought as cycles of synergy rather than linear models, with obvious implications for both pedagogy and creation in all respective fields.

                    REFERENCES

[1] N. Cook, Beyond the Score: Music as Performance. Oxford University Press, 2014.
[2] A. Arbo and M. Ruta, Ontologie Musicale. Paris: Hermann, 2014.
[3] C. Seeger, "Prescriptive and Descriptive Music Writing," The Musical Quarterly, vol. 44, no. 2, pp. 184-195, 1958.
[4] E. Tomás and M. Kaltenbrunner, "Tangible Scores: Shaping the Inherent Instrument Score," in Proceedings of the International Conference on New Interfaces for Musical Expression, 2014.
[5] A. Tanaka, "Sensor-based Instruments and Interactive Music," in R. Dean (ed.), The Oxford Handbook of Computer Music, pp. 233-255. Oxford University Press, 2009.
[6] M.-E. Duchez, "An Historical and Epistemological Approach to the Musical Notion of 'Form-Bearing' Element," Contemporary Music Review, vol. 4, no. 1, pp. 199-212.
[7] P. Antoniadis, D. Fober, and F. Bevilacqua, "Gesture Cutting through Textual Complexity: Towards a Tool for Online Gestural Analysis and Control of Complex Piano Notation Processing," in Proceedings of CIM14, Berlin, 2014.
[8] D. Fober, Y. Orlarey, and S. Letz, "INScore: An Environment for the Design of Live Music Scores," in Proceedings of the Linux Audio Conference (LAC), 2012.
[9] F. Bevilacqua, N. Schnell, N. Rasamimanana, B. Zamborlin, and F. Guédy, "Online Gesture Analysis and Control of Audio Processing," in J. Solis and K. Ng (eds.), Musical Robots and Interactive Multimodal Systems, pp. 127-142. Springer-Verlag, Berlin Heidelberg, 2011.
[10] F. Bevilacqua, B. Zamborlin, A. Sypniewski, N. Schnell, F. Guédy, and N. Rasamimanana, "Continuous Realtime Gesture Following and Recognition," in Embodied Communication and Human-Computer Interaction, Lecture Notes in Computer Science, vol. 5934, pp. 73-84. Springer, Berlin Heidelberg, 2010.
[11] K. K. Hübler, "Expanding String Technique," in C.-S. Mahnkopf, F. Cox, and W. Schurig (eds.), Polyphony and Complexity, pp. 233-244. Wolke Verlag, 2002.
[12] M. Leman, Embodied Music Cognition and Mediation Technology. MIT Press, Cambridge, 2007.
[13] M. Mathews, "The Digital Computer as a Musical Instrument," Science, pp. 553-557, 1963.
[14] A. Veitl, "Musique, causalité et écriture: Mathews, Risset, Cadoz et les recherches en synthèse numérique des sons," in Musicologie, informatique et nouvelles technologies. Observatoire Musical Français, Paris-Sorbonne, Paris, 2006.
[15] C. Cadoz, "Musique, geste, technologie," in H. Genevois and R. de Vivo (eds.), Les Nouveaux Gestes de la musique, pp. 47-92. Editions Parenthèses, Marseille, 1999.
                        TIMBRAL NOTATION FROM SPECTROGRAMS:
                            NOTATING THE UN-NOTATABLE?
[…] digital signal processing is extending this exploration into the electroacoustic context.

Further areas of exploration include an articulation of the relationship between the sound and its context. This relationship is reflected in the scope of our definition of timbre based on Smalley's approach, and in our recognition of Lasse Thoresen's assertion that we need to develop a terminology (and lexicon) to describe the "…phenomenology of music in experiential terms" [2]. This phenomenological approach to timbre was initially begun by Schaeffer and then carried forward by Smalley, and between the writings of all three, begins to accommodate the multiplicity of meanings of 'timbre': structural, contextual, analytical, tonal, and sound quality.

Timbral elements in musique-mixte works are central to interpretation and realization in performance, but often include somewhat vague or technology-specific indications. The authors' experience as performers (flautist and organist) in the musique-mixte domain has prompted aspects of this study, and provides a practical basis for these explorations. In flute works, for example, timbre changes may be indicated by signs (often extended techniques) or words that can be highly evocative and poetic; the electronics may be indicated by effects or technical instructions such as fader control levels, or a particular form of synthesis. Where acoustic and electronic sounds merge, indications of timbre may become the 'property' of the software or mixing desk - the programmed effect. The authors suggest that a creative collaboration working within a performance environment to recreate the composer's intentions, rather than technical instructions, could be more effectively enabled with semiotically relevant timbral representation. In organ works, timbre is often suggested through assumed knowledge of historical performance practice (e.g. Organo Pleno for North German baroque instruments, or the Tierce en Taille of the Classical French organ tradition), or through specific stop indications combined with an understanding and knowledge of the instrument for which a piece was composed. In works for organ and live electronics composed since 1998, the aural and spatial effect of the processing on the overall timbral environment is only 'discovered' in the space after all has been set up. A more specific representation of timbral effect in the score would allow the performers to adapt and optimally develop interpretation and technical set-up according to the performance space.

Investigations of timbral descriptions of traditional instruments led us to Ngabut (2003), Kenyah Bakung Oral Literature: An Introduction, in which the author describes the odeng talang jaran (or jew's harp) from the Borneo Kalimantan region. The description includes detailed descriptions of the instrument's construction (dimension, materials, and decoration), mode of playing, social function and many other cultural features, but makes only one reference to the actual sound of the instrument: "The sound produced resembles that of a frog" [3]. Assuming one knows the species of frog being referred to by the author, and what call it is giving, perhaps this is helpful. A motivating factor in this project is to try to find an objective, non-metaphorical process for notating the sound of the frog. Through spectrographic measurement we hope, as far as the visual can represent the aural, to find symbols and images that can communicate sound quality in all its complexity to a literate observer.

Referring to sound quality - its spectral content, sonic identity and recognition of source - Udo Will attests:

  "…It remains immensely difficult to 'talk about' them - oral cultures have no music theory. Things seem to be different in literate cultures, though. Through the very invention of writing systems, man has acquired means to cope with the elusiveness of sounds: the transformation from an aural-temporal form into a visual-spatial one. Sounds seem to be tamed and time seems more under control if treated spatially; however, this is only seemingly so because the accomplishments of such a transformation are limited and can at times be deceiving" [4].

Combined with the other informal explorations and considerations, these comments became enabling texts to launch this exploration of timbral notation.

Central to the project is the music score itself - what is it, and what relationships do the various participants have with this thing or artifact? One common factor in all our understandings is of the score as an object of potential. The project is generating new questions and raising uncertainties about the nature or ontology of musical scores, as well as the syntactical conventions that exist in different cultures. Our references to Ingold and Foucault support this need for exploration. Kathleen Coessens calls the music score a "coded tool in the arts" and furthermore a score "…is a two-dimensional visual and coded artifact that allows for multiple performances or 'resounding processes' by musicians…[and merges] the visual and the musical, the fixed and the dynamic, space and time" [5]. These are well-understood concepts, which confirm our (Western) cultural understandings of the ontology of a musical score. The project is also grounded in non-Western, oral-based paradigms: what does the score (as artifact or 'thing') mean within these cultures?
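As a concrete illustration of the spectrographic measurement invoked above (a sketch of ours using SciPy; the project itself works with eAnalysis and Sonic Visualiser, and "frog_call.wav" is a hypothetical file name):

    # Sketch: computing and plotting a spectrogram of a recording.
    import numpy as np
    import matplotlib.pyplot as plt
    from scipy.io import wavfile
    from scipy.signal import spectrogram

    rate, samples = wavfile.read("frog_call.wav")   # hypothetical recording
    if samples.ndim > 1:
        samples = samples.mean(axis=1)              # mix stereo down to mono

    # frequency content (spectro-) as it is shaped through time (-morphology)
    f, t, Sxx = spectrogram(samples, fs=rate, nperseg=2048, noverlap=1536)

    plt.pcolormesh(t, f, 10 * np.log10(Sxx + 1e-12), shading="auto")
    plt.xlabel("Time (s)")
    plt.ylabel("Frequency (Hz)")
    plt.show()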
This project will explore the creation of models for the timbral and performance notation of music, incorporating both acoustic and electronic sound sources, initially working with traditional instruments, then within contemporary Western Art Music research through the creation and performance of new musique-mixte and electroacoustic compositions using these possible models and systems (Blackburn, 2014, http://spectronotation.upsi.edu.my). The overall project consists of two conferences, bookending three sub-projects that, taken together, provide opportunities to envision the possibilities and value of timbral notation, aiming to create models from which to develop practical performance-based scores, which are of value to participants in each area. The project's co-researchers are practitioners in ethnomusicology and in acoustic, electroacoustic and musique-mixte music, as academics, creators and performers.

Already queries are arising regarding our ontological understandings of what comprises a score and how it functions and communicates, particularly over time. As Marc Battier, who presented at the project's opening conference in June 2014, observed:

  "… the preservation of a [musical] score is a big issue, and has implications for notation".

A score must be in a form which can be understood and read over long historical time frames, and in a form which allows long-term archival storage and retrieval.

                    2. THE PROJECT

Research questions have evolved for each sub-project based on the following investigative parameters:
  1. Can an intuitive notation system for electroacoustic music be developed from spectral analysis and spectromorphological representation?
  2. What are the elements that composers, musicologists and performers require from a notation system, and how can these be represented?
  4. Can spectrographic analysis and software be used to provide a method for defining and identifying unique qualities of Malaysian indigenous instruments?
  5. Can this information be used to 'describe' and notate the specific individuality of sounds, materials and performance methods in ways that expand the range and musical vocabulary of the ethnomusicologist?
  6. What parameters of analysis can be defined to provide useful and universally 'understood' symbols using spectrographic softwares?

2.1 Issues Arising - a problem statement?

This research project is adopting a multi-faceted approach to exploring the possibilities of creating scores that describe and notate timbre, and which might eventually come to some degree of functionality. The practice of the various co-researchers and the paradigm of their experience provide multiple sites and contexts for the research. These paradigms also encompass the realms of traditional and non-Western music performance, acoustic Western art music performance and music creation, and the environments of electroacoustic and musique-mixte. The range of modes of transmission of music and musical ideas is equally broad - being passed from one generation to the next, from creator to acoustic and electronic performance. Further, it encompasses oral and rote learning, common notation scores to software, and works dependent on the software that was used to create them as a way of preserving them. In these notation systems, with the exception of the electroacoustic performance software, there is no way of describing the quality of imagined sound - our 'frog call'.

What is notation and what is a score? Both are separate objects, but intertwined with cultural, ontological and semiotic inferences, all of which impact the artifact we call the score. In Western art music, a score is an artifact (often on paper, but perhaps in other media or in soft copy) used to communicate the musical ideas of the score's creator to the performer and, with an assumption of the performer's active creative input, to the listener. In traditional Malaysian music, we can describe the score as, more commonly, a series of memories and traditions, perhaps articulated mnemonically but not, until quite recently, written down. In this traditional music, pitch and rhythmic inaccuracies that arise from the use of common practice notation are considerable but, except insofar as they are measured in spectrograms, beyond the scope of this presentation.

Our conception of the score as 'thing' connects the meaning of the score to Ingold's theory of 'correspondence' [6], drawing us to a significant difference between a score and a spectrogram - the spectrogram is an historical document: 'this sound was like this'. We can measure the sound that happened in this way, and read it as such. Contrarily, a music score (with its multiplicity of meanings) is a 'thing' of possibility [7]. It is a creator/composer's conception of some sounds that, if recreated in this or that way by the performer, has the possibility of generating non-verbal ideas and concepts in the minds of the performers and listeners. Manuella Blackburn suggests a new way of using the spectrogram to help generate compositional ideas in her exploration of the potential of spectromorphology and its associated language as a process for composition [8]. She writes:

  "… spectromorphology can be approached from an alternate angle that views the vocabulary as the informer upon sound material choice and creation. In this reversal, vocabulary no longer functions descriptively; instead the vocabulary precedes the composition, directing the path the composer takes within a piece. This new application is an attempt at systemization and an effort to (partly) remedy the seemingly endless choice of possibilities we are faced with when beginning a new work" [8].
   Blackburn’s suggestion of the use of spectromorphology as a compositional tool suggests the possibility of changing the historic nature of the spectrogram into one of potential.
   Other researchers have struggled with many of the issues that have arisen in our individual and collective deliberations. Rob Weale3, in the EARS Glossary of Terms entry on Spectromorphology, notes that there is both interdependence and dynamism in the word spectromorphology.

3 www.ears.dmu.ac.uk/spip.php?rubrique28

   Whilst not reducing the historic quality of a spectrogram, this is helpful to this project for the conceptual development of a timbral score, as he describes spectromorphology as a tool for “describing and analyzing listening experience.” He continues: “The two parts of the term refer to the interaction between sound spectra (spectro-) and the ways they change and are shaped through time (-morphology). The spectro- cannot exist without the -morphology and vice versa: something has to be shaped, and a shape must have sonic content” [9]. So there is the possibility of dynamism in a spectral score.
   The score, if incorporating some form of spectrography, will probably contain graphics that also have semiotic qualities. Martin Herchenröder, in discussing Ligeti’s graphic score of the organ work Volumina, adds musical and performative gesture to the inherent quality of a score as he attests:

   “…, it is a coherent system of signs [semiotics], whose details can all be translated into musical patterns. A look at the third page of the score of Volumina illustrates the cluster through visual analogy. The horizontal dimension corresponds to the flowing of time: The time sequence of musical events (according to the reading habits of the western world) is a left-right succession of notes. Thus, in principle, each event is fixed in time - the new cluster in the right hand as posits an approximately after 17 seconds, after another 10 seconds of complete, another 4 seconds later” [10].

   It has been argued that this gestural quality is also semiotic and tied to the sonic gesture. The ‘left-right’ succession of symbols and their vertical location on the page indicating pitch (high/low) also has sonic inferences that offer potential for developing elements of performance notation [11]. Treating the score of Volumina as an xy graph for time and pitch, we can see that the evident gestures and sonic shapes are potentially useful in timbral notation. It is an area where the left-right and vertical associations could be helpful in ‘notating’ gestures which, by their musical outcomes, are also timbral. O’Callaghan and Eigenfeldt have demonstrated how spectral density can be implied within acoustic and musique-mixte compositions [12]. Combining colour, which can be ascribed various meanings, and graphic, gestural notation as outlined above is proving a rich potential model in creating gestural notation in the musique-mixte performance environment. This model is described in greater detail below.

2.2 The Sub-Projects

This research project is structured around three principal sub-projects which, though operating in parallel, allow a sequential development of models and notational ideas. The applications used to create the spectrographs used in this project are Pierre Couprie’s eAnalysis [13] and Sonic Visualiser [14].
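   As a point of reference, such spectrographs rest on a conventional short-time Fourier transform. The following is a minimal sketch of that analysis, assuming a hypothetical mono recording; it illustrates the general technique, not the internal workflow of eAnalysis or Sonic Visualiser.

    # Minimal STFT spectrogram sketch; "gedumbak_take1.wav" is a hypothetical file.
    import numpy as np
    from scipy.io import wavfile
    from scipy import signal

    fs, audio = wavfile.read("gedumbak_take1.wav")
    if audio.ndim > 1:
        audio = audio.mean(axis=1)               # mix stereo to mono

    # Magnitude spectrogram: rows are frequency bins, columns are time frames
    freqs, times, Sxx = signal.spectrogram(audio, fs=fs, nperseg=2048, noverlap=1536)
    Sxx_db = 10 * np.log10(Sxx + 1e-12)          # log scale, as such tools display it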
2.2.1 Project 1: Ethnomusicology Project

The ethnomusicological sub-project, using spectrograms, provides traditional music professionals with an objective understanding of the nature of the sound quality of specific instruments, and the musical or ritual context in which they prefer to use it. As a music tradition that is oral, transmission of music and pieces is achieved by rote, repetition, and aural memory. This research is not an attempt to standardize the sound of instruments. Instead, it adds to the knowledge of the Wayang Kulit artform, which is presently in a difficult phase. In parts of Malaysia, including one of its places of origin, Kelantan, it is banned. University programmes, such as those maintained by UPSI, are important in the continued artistic viability and vibrancy of Wayang Kulit (Director of Kelantan Arts and Culture Museum, personal communication in Penny/Blackburn FRGS The Imaginary Space, 2014). This spectrographic process is demonstrating the value of profiling instruments, allowing makers objective knowledge of the range of sounds preferred by the musicians who play and perform.
   The first process within this sub-project has been to record the sound of, then create spectrograms of, traditional Malaysian Wayang Kulit shadow puppet music theatre. UPSI maintains a group of resident musicians specializing in this musical form. In performance, a group of four to six musicians and the master puppeteer are all located out of sight behind a large translucent screen, which is the stage for the shadow puppets. Our study includes an exploration or profiling of sounds preferred by professional traditional musicians in certain percussion instruments.
   The orchestra of the Wayang Kulit Siam (as found in Kelantan, Malaysia) consists of percussion instruments including a pair of double-headed drums – gendang, a pair of single-headed goblet-shaped drums – gedumbak, a pair of vertically standing drums (gedug) hit with beaters, small hand cymbals – kesi, a pair of inverted gongs – canang, and a pair of hanging knobbed gongs – tetawak. Melodic instruments include the serunai (a double-reed instrument, similar to the shawm) and a three-string spike bowed instrument – rebab. The instruments, while individually important, gain their true significance in an ensemble and dramatic context. When making recordings of various instruments, initially it seemed sensible to just record the instrument in a dry, unadorned environment. However, in order for the Wayang Kulit leader (Pak Hussain) to make his assessments, the recordings that ended up being made were of the whole group playing while testing out the Gedumbak for different dramatic environments. Selecting instruments for their suitability in a given drama (normally, the stories are drawn from thirty or so traditional stories) means that the players are more interested in their collective role than the individual, so the recordings were made to reflect this. The gedumbak was close-miked, and the rest of the ensemble sound was allowed to spill into these microphones. The longer red lines in the last section of the short segment in Figure 2 show the moment when the serunai enters.

Figure 1. Testing the Gedumbak.

Figure 2. Spectrogram of Wayang Kulit ensemble – segment of recording focused on Gedumbak with strong onset feature.

   Why, for example, is one pair of Gedumbak preferred in one piece over another? Spectrograms can show a profile of the sound, which may then be attached to a musical (or, in the case of Wayang Kulit, dramatic) context. Spectrograms further show us that by using different modes of playing, different timbral qualities can be emphasized in the same instrument – brighter or more mellow and so on. Co-researcher Mohd. Hassan Abdullah has pointed out that mnemonic forms of teaching and communicating musical content in Malaysian traditional music also imply different timbral and gestural modes of playing. So, we ask the question: can this content be given a visual (spectrographic) or written form, and applied in the other projects?
   A second strand in this project investigates a ‘Western’ facet – the creation of a recorded catalogue of extended flute performance techniques, using a concert flute, which have been spectrographed and analysed for their characteristics. These characteristics are being extracted for the development of a form of spectral representation that can be adapted for use in common notation scores, particularly for acoustic instruments. This strand has been productive, opening ideas and knowledge that leads into the second sub-project, combining acoustic and electroacoustic musical contexts in new compositions.

2.2.2 Project 2: Musique Mixte Project

The musical score as semiotic medium can be understood as an “infinite substance” [15] that activates the musician’s ability to imagine and translate notation into a temporal unfolding of new knowledge and experience. As we look towards extending performance practices into new conceptual contexts and relationships, new paradigms that reflect and drive new expressions and activities evolve. Timbral notation as a context of change motivates explorations of shifting performative relationships, new ways of thinking and performing, and a reconceptualization of the score/performer relationship.
   This part of the project will create models for spectrographic notation as performance scores. Analyses of notation, timbre and organology associated with chosen instruments and electronics (musique mixte) will be undertaken to develop a framework for investigating spectrographic analyses, evaluations and outcomes. New works will generate performance analyses through phenomenologically based studies, following the sound spectrum and performer responses to new musical works.
   We question the role of the score as mediator between mind and sound [16]. What information is conveyed through spectral timbre notation? What are the semiotic implications of sound codification? Is the information rigid, or a point of departure for the performer? A performer’s notation needs clarity and embedded knowledge or information that directly communicates to them – that is clear, readable, interpretable, and informative of what the music is about. The multiple layers of a spectrograph emit different levels of information, multiple meanings, different streams of representation – all systems that require understanding and evaluations of the relations of the score. What can a performer expect – information of spectral density? Aesthetically, a spectrograph is a beautiful object – but just how effective and informative is it as timbral notation for the performer? Is it instructional, or suggestive, gestural, strictly coded or freely interpretable? Can a spectrograph be as revealing or evocative as a beautifully notated score? Can it evoke spatialities, memories, or sonic energies? What is the need for this as notation?
   Investigating the recordings and single-frame spectrographs of the Western flute extended techniques will allow us to experiment with the flautist to see how effective this is in the re-creation of timbres. The form of timbral representation on which we will focus does not consider fundamental pitches or duration, rather an emphasis on specific overtones. Pitch and duration are indicated using common musical notation. As a catalogue of sounds and acoustic performance techniques, the spectrographic series (see Figure 3) as a research process model provides some ways forward to link timbral representations to scores in a musique-mixte environment.

Figure 3. The spectrographic series as a research process model: a recorded note (flute / extended techniques) is spectrographed, and its features (F0) and characteristics are extracted, with the significant harmonic series highlighted.
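   The feature extraction summarised in Figure 3 can be prototyped directly from such spectral data. The sketch below is an illustration under our own assumptions (a time-averaged magnitude spectrum and a 200-2000 Hz fundamental search band), not the project’s actual analysis code.

    import numpy as np

    def harmonic_profile(freqs, spec, f_lo=200.0, f_hi=2000.0, n_harmonics=8):
        # Estimate F0 as the strongest spectral peak in an assumed fundamental
        # range, then read the relative strength of its harmonic series.
        band = (freqs >= f_lo) & (freqs <= f_hi)
        f0 = freqs[band][np.argmax(spec[band])]
        amps = np.array([spec[np.argmin(np.abs(freqs - k * f0))]
                         for k in range(1, n_harmonics + 1)])
        return f0, amps / amps[0]                # timbral profile, fundamental = 1.0

Here spec would be, for example, the time average of a spectrogram such as the one sketched earlier (Sxx.mean(axis=1)).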
   Can a circle be drawn around the score as space, and the spectrograph act as facilitator and activator of that space? In a recent study of intercultural music performance in Malaysia4, heterotopia was articulated through the performative lens, the performance as a context for understanding artistic realisation of intercultural knowledge and experience. This space was posited as an ecology: a set of relationships, the music, the performance, a symbiosis of elements of the cultures, collaborations and connections that occur [19].
   Digital media tends to handle music as encoded physical energy, while the human way of dealing with music is based on beliefs, intentions, interpretations, experience, evaluations, and significations [20], but the exploration of timbral notational elements and relations might activate questioning and re-assessment of values; the search of microstructures might lead to a search for sonic essences and deeper self-understandings; new dimensions evolve, new ways of thinking and living (performing) result. These questions engage us with discovering the meaning of the music as new dimensions of musical practice open up.
notes. This approach allows the retention of score relationships and its potential quality while providing the composer with a means of specifying timbral quality within their score.
   Adapting this approach using graphic notation could include the dynamic quality of the spectrogram, which can include indications of duration, pitch, relative amplitude and the ADSR envelope. These could be incorporated into a form of notation that may resemble a colourised version of, for example, Ligeti’s score of Volumina. The representation of music in this form might also be readable as a type of gestural notation, of pertinence to software instrumental performance, though this is a process currently being examined in our pieces. This approach must be considered only a starting point – a model for investigation.
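   As a purely illustrative reading of such a colourised notation (no colour scheme is fixed at this stage of the research, so the mapping below is an assumption of this sketch), a ‘rainbow’ hue could be derived from a note’s measured spectral brightness:

    import colorsys
    import numpy as np

    def centroid_to_rgb(freqs, spectrum, f_lo=100.0, f_hi=8000.0):
        # Spectral centroid as a rough 'brightness' measure, mapped to a hue
        # running from red (mellow) towards violet (bright).
        centroid = float(np.sum(freqs * spectrum) / np.sum(spectrum))
        pos = (np.log(centroid) - np.log(f_lo)) / (np.log(f_hi) - np.log(f_lo))
        pos = min(max(pos, 0.0), 1.0)
        return colorsys.hsv_to_rgb(0.8 * pos, 1.0, 1.0)

A notehead or gestural shape could then be filled with the returned RGB value, keeping pitch and duration in common notation while colour carries the timbral layer.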
2.2.4 Electroacoustic Music Project

The third sub-project is an exploration of the use of spectrograms to create a form of timbral notation, which could be used in electroacoustic compositions as a way of preserving the music independently of the software/hardware used to create them. As noted earlier, finding a mode of preserving a score is a major concern. One possible approach, and one which culturally locates this research in South East Asia, is an exploration of the potential of adapting ‘Uthmani’ notation used in Qurannic recitation as a form of timbral or gestural notation. This exploration is not based around content, but is focussed on the context of how ‘Uthmani’ is used, written and recited ‘through sound’. Hasnizam Wahid from UniMAS – Sarawak, and one of Malaysia’s leading electroacoustic composers, is particularly focusing on this area. This project is yet to begin, as the first two projects are creating many of the fundamental bases that must first be established. It is anticipated that this detailed research will begin in July 2015, continuing until the end of the year.

3. FUTURE PATHWAYS

Having identified some of the possible pathways for finding models of spectrographic or timbral representation in a score, this section suggests directions that this research might follow. They are not presented in order of preference or significance, but remain possibilities that address the outcomes of the research so far, outcomes yet to be realised, and issues and challenges so far identified.
   If one were to wish for a software tool – and we will look at supporting software development in later research phases – it would be along the lines of a reverse action of spectrographic software: a program such as eAnalysis currently takes an audio file and from it creates an image; is it possible to take that spectrogram and create an audio file to ‘recreate’ the sounds of the original file?
   A simple outcome (though conceptually complex) would be to take some of the various software packages and have them sonify a spectrogram. Some simple experimentation with existing software packages, using Audio Paint [21], has been undertaken. The results using these are not promising. The concept might be helpful in realizing electroacoustic scores without access to the software used to create them. There are many issues and concerns at this juncture, which make this process one for a separate and continuing research project, developing and evolving models that might be forthcoming from this project. Some of the problems lie in the impact of the space in which a sound is being projected and its influence on timbre. For multichannel electroacoustic works there is the question of how one will ‘record’ the original sound – as separate channels with individual spectrograms, which might then be reconstructed? Combined with the possibilities of the models outlined above, and acknowledging the many complexities, it is a worthy goal to gain the ability to recreate fixed works long after the original software or hardware that created them is lost.
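   One established technique for this reverse action is phase reconstruction from a magnitude spectrogram, for instance the Griffin-Lim algorithm. The sketch below, with a hypothetical filename, illustrates the general principle; it makes no claim about how Audio Paint or eAnalysis operate.

    import librosa

    y, sr = librosa.load("fixed_work.wav", sr=None)       # hypothetical source file
    S = abs(librosa.stft(y, n_fft=2048, hop_length=512))  # magnitude only; phase discarded
    y_hat = librosa.griffinlim(S, hop_length=512)         # iteratively re-estimate phase

The audible gap between y and y_hat gives a first measure of how much of a fixed work such a notation could actually preserve.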
4. CONCLUSIONS

Our research to date seems to allow an optimistic attitude that spectrograms can be used as the basis of a timbral notation. The cultural significance of the score as an artifact and the relationships it implies – from composer/creator to performer to listener – must be accounted for in any new notation practices that develop to allow for specific timbral elements demanded by the composer. Our suggestion within instrumental contexts of a rainbow spectrum adds a new layer of complexity to the score, but we assert this enriches the various relationships established within the score’s environs. The model of gestural notation appears to have the potential to provide a technically workable yet semiotically rich notational ontology, which will provide the basis for investigation in the electroacoustic/acousmatic context. In this sub-project, it is predicted that what Smalley describes as the discrimination of “…the incidental from the functional” [22] will be a major area of consideration. In many ways, findings relating to this project are the posing of more questions. Nevertheless, some elements of what will develop into models of timbral notation are suggesting themselves to the research group.
Acknowledgments

The authors would like to acknowledge the support of the following people and centres:
   Co-researchers: Associate Professor Dr Mohd. Hassan Abdullah, Associate Professor Dr Hasnizam Wahid, Associate Professor Dr Valerie Ross.
   Professor Marc Battier – Université de Paris (Sorbonne).
   The Research Management and Innovation Centre at Universiti Pendidikan Sultan Idris, and its Director Associate Professor Dr Tajul Said.
   Malaysian Ministry of Higher Education and the Fundamental Research Grant Scheme.
   Research Assistant, Hafifi Mokhtar.

5. REFERENCES

[1] D. Smalley (1994) “Defining Timbre - Refining Timbre” in Contemporary Music Review 10 (2), pp. 35-48.
[2] L. Thoresen (2001/4) Spectromorphological Analysis of Sound Objects. An Adaptation of Pierre Schaeffer’s Typomorphology. The Norwegian Academy of Music, p. 2.
[3] C.Y. Ngabut (2003) Kenyah Bakung an oral literature: An introduction. Last accessed 20/01/2015 http://www.cifor.org/publications/pdf_files/Books/social_science/SocialScience-chapter12-end.pdf
[4] U. Will, The magic wand of Ethnomusicology. Rethinking notation and its application in music analyses. Accessed 20/01/2015 http://music9.net/download.php?id=6039
[5] K. Coessens (2014) “The Score beyond Music” in P. de Assis, W. Brooks, K. Coessens, Sound and Score: Essays on Sound, Score and Notation. Leuven University Press: Ghent, p. 178.
[6] T. Ingold (2008) Bringing Things to Life: Creative Entanglements in a World of Materials. Accessed 25/01/2015 http://www.reallifemethods.ac.uk/events/vitalsigns/programme/documents/vital-signs-ingold-bringing-things-to-life.pdf
[9] R. Weale, “Spectromorphology”, EARS Glossary of Terms. Accessed 29/01/2015 on http://ears.pierrecouprie.fr/spip.php?rubrique28
[10] M. Herchenröder (1999) Struktur und Assoziation. György Ligetis Orgelwerke. Schönau an der Triesting. Wien: Edition Lade, pp. 62-63.
[11] A. Blackburn (2013) “Sourcing gesture and meaning from a music score: Communicating techniques in music teaching” in Journal of Research, Policy & Practice of Teachers and Teacher Education, Vol 3, No 1, pp. 58-68, p. 60.
[12] J. O’Callaghan & A. Eigenfeldt (2010) “Gesture transformation through electronics in the music of Kaija Saariaho” in Proceedings of the Seventh Electroacoustic Music Studies Network Conference, Shanghai, 21-24 June 2010, www.ems-network.org.
[13] P. Couprie, eAnalysis, http://logiciels.pierrecouprie.fr/?page_id=402
[14] http://www.sonicvisualiser.org
[15] D. Barenboim (2009) Everything is Connected. London: Phoenix.
[16] M. Leman (2008) Embodied Music Cognition and Mediation Technology. Cambridge, MA: The MIT Press.
[17] D.G. Bhalke, C.B. Ramo Rao, D.S. Bormane, M. Vibhute (2011) “Spectrogram based Musical Instrument Identification Using Hidden Markov Model (HMM) for Monographic and Polyphonic Music Signals” in ACTA Technica Napocensis Electronics and Telecommunications, Vol 52, No 12.
[18] Y. Harris (2014) “Score as Relationships: From Scores to Score Spaces to Scorescapes” in Sound and Score: Essays on Sound, Score and Notation. Ghent: Leuven University Press.
[19] J. Penny (2015, upcoming) “The Mediated space: Voices of interculturalism in music for flute” in Routledge International Handbook of Intercultural Research. Eds P. Burnard, K. Powell, E. Mackinlay. Routledge: Abingdon.
[20] M. Leman (ibid.)
COMPOSING WITH GRAPHICS: REVEALING THE COMPOSITIONAL PROCESS THROUGH PERFORMANCE

Pedro Rebelo
Sonic Arts Research Centre
Queen’s University Belfast
[email protected]
ic choices to the sound result arguably come to the foreground in non-instructional graphic scores. In this paper we are particularly concerned with the qualities and characteristics of this decision-making process and how they relate to the act of composing with graphics. In order to articulate this relationship we will begin not with the compositional process or intention but rather with a reflection on the dynamics of trust and engagement at the point when a performer decides to work with a non-instructional graphic score. Two distinct situations can occur which have a significant impact on subsequent performance preparation. This has to do with whether performer and composer are in communication with each other or not. In the first case, it is not uncommon for performers to need assurance that there is indeed no interpretative code behind the score. The assumption, even for performers who are accustomed to graphic scores, seems to be that the score is a mediator for a musical structure that pre-exists in the composer’s mind. A situation in which performer and composer are not in communication is perhaps more illustrative of the process of performance preparation of these kinds of works, since the performer arguably gains full autonomy. We will address three aspects which determine how a score is transformed from a static document into an enabler for music performance in a creative ecology involving musicians, instruments, venues, audiences etc. These three aspects focus on 1. cultural context and performance practice traditions, 2. relative connections/mappings between graphical and musical languages from the perspective of texture and gesture, and 3. the emergence of form as a derivation of the score’s ability to frame musical time.
   The very function of a score as a symbol for ‘the work’ is in many instances also problematized with graphic scores. In her discussion of Cardew’s Treatise, Virginia Anderson discusses the function of a score and what it represents for Cardew in contrast to Stockhausen (to whom Cardew was an assistant):

   “For Stockhausen, the performance is made in his service; the piece remains his and the performers should divine his intention even when it is not written down. For Cardew, the score is the responsibility of the performers once it is composed.” [4]

   This performer responsibility is exactly what we want to address through reflecting on the unspoken rules that emerge from any kind of music making. In the case of Cardew, his Scratch Orchestra (1962-72), set up to perform his other iconic work – The Great Learning – stands as a group of collaborators who commit to a rather specific ideology of music making and therefore share an approach to music which no doubt determines how the work with graphic scores unfolds. Cardew notably lays out his vision of social and musical dynamics in A Scratch Orchestra: draft constitution:

   “A Scratch Orchestra is a large number of enthusiasts pooling their resources (not primarily material resources) and assembling for action (musicmaking, performance, edification).” [5]

   As with any music tradition, non-instructional graphic scores carry with them conventions and agency, which relate to how a specific performance lineage develops. As such, an understanding of this lineage becomes an important element in approaching graphic scores. Performance practice itself influences how a particular score is used.
when performing a score, free improvisation is not the primary mode of engagement.
   Without a code but still with the notion that the score governs the music, the graphic elements inevitably suggest a process of mapping, a set of relationships between the language of the graphics and a musical language (which is invariably situated in a particular performance practice, as discussed above). This mapping can take the form of literal association (dense graphics – dense musical texture, graphical weight – musical dynamics, qualities of lines and shapes – musical gestures) or more formalised and codified strategies. In any case, the performer is faced with deciding on how this mapping will occur, either for a particular performance or as a deliberate codification for a score to be repeated over multiple performances. In contrast to the work conducted in the area of parameter mapping in computer systems [7], the type of mapping discussed here is relatively unexplored. The mapping processes in question here implicate both multimodal perception, as explored in fields such as visual music [8], and musical practices and conventions, which range from cartoon gestural symbiosis in the music of Carl Stalling to mathematical translation of curves and textures in the work of Iannis Xenakis.

4. EXTRACTING STRUCTURE AND MUSICAL FORM

An element that is pervasive in the act of engaging with scores of any sort is the realisation of musical structure and form. This is partly to do with the relationship between music, as an ephemeral time-based phenomenon, and the physical score as an outside-time artifact representing a sequence of events that can be seen at a glance. From the layout of the page to the palette of graphic elements employed in a score, a sense of structure is inevitably conveyed through framing (page layout, margins, relationship between pages) and placement of discrete elements (shape, colour, scale, repetition). It is in this domain that the compositional process is revealed. This happens as a process that shifts an understanding of a graphic score as a visual object to a musical one: an object which is made to speak the same language as all other elements of music making, the relativist language of ‘louder than’, ‘same as before’, ‘more dense’, ‘higher’, ‘lower’, ‘slower’, ‘faster’ etc. This relativism is particularly pronounced as performers face a score which clearly contains musical information but no code to produce instructions. All decisions are then made from the score and in relation to the score.

5. REVEALING COMPOSITION

The three aspects at play when preparing a graphic score for performance as discussed above gradually reveal the compositional process and the making of the score itself. This process is driven by musical thinking of varying degrees of determinacy (i.e. more or less precise musical structures). It is also guided by a relationship with notation as material, its affordances and conditions. The ways in which different types of notation strategies enable composers to operate directly on musical elements, to the extent that to compose and to notate can be seen as the same action, have been discussed elsewhere [9]. In order to better articulate this revealing of the compositional process we will refer to the work Cipher Series as an example.

   “Cipher Series is a collection of graphic scores that are displayed to audience and performers in accordance to a fixed temporal structure generated for each performance. The performance plays on the role of notation as a mediator of listening, setting up a performative condition based on interpretative strategies based on engagement by both the performer and the audience. The change from one graphic score to the next has immediate formal implications for the music and acts as a way of articulating shifts in musical material or interpretation strategy.” From Cipher Series’ performance notes (Rebelo, 2010)

   As can be seen in the images below, Cipher Series employs line drawing (created by hand on a graphics tablet and vector graphics software) in a black and white paginated format. The score is a collection of pages, to be played independently or in sequence. The most common performance format is a pre-determined timed sequence for seven pages. Each page has a pre-determined duration between 40 and 90 seconds and the transition between pages is cued by a 10 second countdown. In this version of the work, the sequence is run twice. In the first iteration, the beginning 30 seconds from each page are recorded and then played back during the second. The sound projection of this playback is intended to be placed as close as possible to the instrument (e.g. a loudspeaker inside the piano body) in order to expose the ambiguity of what is live and what is pre-recorded. By exposing a specific graphics-sound relationship twice we explore the very nature of mapping and interpretation. The moment a recording is triggered, projecting the sound events made when that same graphic score first appeared, the performer is faced with the decision of whether to imitate her previous interpretation, complement it or indeed do something entirely different. The score of Cipher Series was conceived for audience display, which further exposes the decision-making process. By displaying the score the performer is following (without the cued countdown that triggers a change of page), the audience is also invited to derive their own mappings and musical structures.
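   The timed structure described in these performance notes can be summarised as a simple schedule. The sketch below encodes it with hypothetical page durations chosen within the specified 40-90 second range; the cue labels are placeholders, not part of the work.

    COUNTDOWN = 10       # seconds, as specified in the performance notes
    RECORD_WINDOW = 30   # seconds recorded from each page in the first iteration
    page_durations = [40, 75, 60, 90, 55, 80, 45]   # hypothetical, within 40-90 s

    events, t = [], 0
    for iteration in (1, 2):
        for page, dur in enumerate(page_durations, start=1):
            events.append((t, f"show page {page}"))
            if iteration == 1:
                events.append((t, f"record first {RECORD_WINDOW}s of page {page}"))
            else:
                events.append((t, f"play back iteration-1 recording of page {page}"))
            events.append((t + dur - COUNTDOWN, "start 10 s countdown"))
            t += dur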
   The layout of Cipher Series on the page follows a number of conventions which are apparent without the need for rules on interpretation. These include the landscape layout with orientation determined by a legend at the
bottom right corner. This mode of presentation suggests left-to-right reading, although this is not specified. Each page presents a self-contained musical sequence of events which can be played once or more times given a specific duration. A number of pages have relatively complex and detailed graphics, at times resembling eastern calligraphy. The density of events makes it practically impossible to engage in a “one-to-one” gestural mapping (i.e. one visual stroke determining one musical gesture), much as in Applebaum’s Metaphysics of Notation. This is a deliberate attempt to invite the performer to engage with the score in ways other than scanning through events at a regular pace. In fact, in my own performances of the score I often focus on sub-sections of the page for repetition.
   The most apparent compositional strategy employed here is perhaps the modular approach to the page as a frame for musical activity. In this context the transitions from page to page articulate the most striking musical changes. Even without a process of codification, a performer preparing such a score will respond to the change of scale and texture evident in the difference between page 1 and page 2 below.

Figure 1. Cipher Series, p. 1 (Rebelo, 2010)

Figure 2. Cipher Series, p. 2 (Rebelo, 2010)

   Cipher Series was the first in a sequence of works that share this type of graphical language (Quando eu nasci and Trio, both from 2011). These later works are designed for ensembles and develop the language to reflect a sense of musical parts which inhabit the same space. In Trio a simple colour scheme assigns each performer to a part while all other elements of the score remain non-instructional. Compositional strategies here reveal themselves also in the way the three parts relate to each other. Relationships of accompaniment, continuation, counterpoint, synchronisation can be derived from the score to inform musical performance.

Figure 3. Trio, p. 1 (Rebelo, 2010)

6. CONCLUSIONS

By focusing on a type of graphic score practice that is deliberately un-codified and not based on the delivery of instructions for performance, this paper articulates the dynamics at play during the process of performance preparation. We argue that the autonomy transferred to the performer, or, to be more precise, to the performance condition, is an act that reveals the compositional thinking behind a work. By bringing meaning into a score, a performer is following a roadmap created by a composer but deciding on how the journey is to unfold. The score as a roadmap gains the function of a document establishing musical circumstances, which within a performance practice become one of many elements determining the making of music. Composing with graphics ultimately reflects a desire to see the score not as the embodiment of “the work” but rather as a working document which only comes to life in the social workings of music making.

Acknowledgments

The insights discussed here derive from the experience of performing graphic scores with musicians such as Evan Parker, Elizabeth Harnick, John Eckhardt, Steve Davis, Franziska Schroeder, Basak Dilara Ozdemir, Ensemble Adapter, Ricardo Jacinto. Thank you all for your musical generosity.

7. REFERENCES

[1] Wadle, Douglas C. “Meaningful scribbles: an approach to textual analysis of unconventional musical notations.” Journal of Music and Meaning 9 (2010).
[2] Pritchett, James. The Music of John Cage. Vol. 5. Cambridge University Press, 1996.
[3] Small, Christopher. Musicking: The Meanings of Performing and Listening. Wesleyan University Press, 1998.
[4] Anderson, Virginia. “‘Well, It’s a Vertebrate …’: Performer Choice in Cardew’s Treatise.” Journal of
    Musicological Research 25, no. 3–4 (December 1, 2006): 291–317.
[5] Cardew, Cornelius. “A Scratch Orchestra: Draft Constitution.” The Musical Times 110, no. 1516 (June 1, 1969): 617–19.
[6] Cardew, Cornelius. Treatise handbook, including Bun no. 2 and Volo solo. London: Edition Peters, 1971.
[7] Hunt, Andy, Marcelo Wanderley, and Ross Kirk. “Towards a model for instrumental mapping in expert musical interaction.” Proceedings of the 2000 International Computer Music Conference, 2000.
[8] Evans, Brian. “Foundations of a visual music.” Computer Music Journal 29.4 (2005): 11-24.
[9] Rebelo, Pedro. “Notating the Unpredictable.” Contemporary Music Review 29, no. 1 (February 2010): 17–27.
    ACCESS TO MUSICAL INFORMATION FOR BLIND PEOPLE
                                                         Nadine Baptiste-Jessel
                                                              IRIT-UT2
                                                         [email protected]
ABSTRACT

In this paper we describe our approach to helping blind people access musical information. Guidelines of our approach are centered on information accessibility according to user disability. We present the process which allows musical information to be coded and converted so that it may be read, played and analysed by a blind musician. We focus our approach on the various levels of description of the score done by several codes, and we exploit and describe existing results like BMML (Braille Music Markup Language), defined during the Contrapunctus European project. We describe and comment on different scenarios using existing free conversion modules and software to obtain a score in BMML that may be read and manipulated by blind people using BMR (Braille Music Reader). We recommend the tutorials created during the Music4VIP European project.

1. INTRODUCTION

Some IT solutions exist to help blind people to access music, but analysis of these reveals both their utility and their limits. As Antonio Quatraro (blind musician) says, there are many factors which hinder the musical education of blind people: the lack of special-needs training of teachers in mainstream schools and conservatoires, the difficulty of finding music scores in an accessible format, and the persistent idea that music can only be learnt by ear.
   Compared with existing methods of converting music into Braille like [1] and [2], our solution is based on the design of BMML (Braille Music Markup Language) [3]. To explain the process we first describe the principles of Braille music, in the next part the tools and code used to translate a score into an accessible format, and in the final part we recommend the use of BME2 (Braille Music Editor) and BMR (Braille Music Reader) [4].

Copyright: © 2015 Nadine Baptiste-Jessel. This is an open-access article distributed under the terms of the Creative Commons Attribution License 3.0 Unported, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

2. BRAILLE MUSIC PRINCIPLE

The rules used to create a Braille music score are presented in the New International Manual of Braille Musical Notation compiled by Bettye Krolick [5]. It is important to note that, just as with conventional musical notation, this is an international code, and so it is possible to exchange Braille scores between different countries. To explain the challenges involved in learning Braille music we divide the rules into three types: the simple rules, the presentation rules and the contraction rules. In this chapter we also describe BMML (Braille Music Markup Language).

2.1 The simple rules

These are the rules used to transform music information into one or more Braille characters.
   For example, the G clef is indicated by three Braille characters: >/l – although this information is not so important in Braille because the octave signs, rather than clefs on a staff, indicate the register of specific pitches in Braille music.
   The octave sign is placed immediately before the note; for example, the 4th octave mark is _
   The name of a note is indicated by the four upper dots of a Braille character:

Table 1. The Braille name of note.

   The duration is indicated by the two lower dots, as shown below.
   Whole notes and 16ths  -
   Half notes and 32nds  '
   Quarter notes and 64ths  ,
   Eighth notes and 128ths  ·
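   The two kinds of sign combine mechanically: a note cell is the union of its name dots (upper) and its value dots (lower), and any dot set maps directly onto the Unicode Braille block. The following is a minimal sketch of this principle; it is an illustration only, not part of the BMR or Music4VIP tools.

    # A Braille music note cell = upper dots (1,2,4,5) for the note name
    # + lower dots (3,6) for the rhythmic class. Values are ambiguous by
    # design: one cell means whole/16th, half/32nd, quarter/64th or
    # eighth/128th, disambiguated by context.
    UPPER = {"C": {1, 4, 5}, "D": {1, 5}, "E": {1, 2, 4}, "F": {1, 2, 4, 5},
             "G": {1, 2, 5}, "A": {2, 4}, "B": {2, 4, 5}}
    LOWER = {"whole/16th": {3, 6}, "half/32nd": {3},
             "quarter/64th": {6}, "eighth/128th": set()}

    def braille_note(name, value):
        dots = UPPER[name] | LOWER[value]
        mask = sum(1 << (d - 1) for d in dots)   # dot n sets bit n-1 (Unicode Braille)
        return chr(0x2800 + mask)

    print(braille_note("C", "quarter/64th"))     # -> '⠹'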
   So a short simple score will be transcribed:

Figure 2. A simple score in Braille.

2.2 Presentation rules

Two presentations exist for keyboard instruments or other ensembles: bar over bar and section by section.
   Bar-over-bar presentation presents a Braille line for each stave, and the first note of each bar appears in parallel.
   Section-by-section presentation presents a number of bars for one stave followed by a number of bars from the other.
   These different presentation rules are available for the whole score.
   Other presentation rules exist to add in the Braille score the corresponding print page number, to facilitate collaboration with sighted musicians.

2.3 Contraction rules

There are two types of contraction rules: dot reduction and character reduction. These different rules reduce the reading time and the number of pages in a Braille score. The rules are designed to help readers with a good knowledge of Braille music.
   Example of dot reduction:

   Example of character reduction:

   When the same interval appears several times, the first interval sign is doubled and then one interval sign is placed at the end.
   To store all the Braille information we created the BMML code during the Contrapunctus project.

2.4 BMML

BMML code was designed with the following goals:
   - to encode Braille structure and content as defined in [3],
   - to facilitate conversion from and to other music notation encodings such as MusicXML [6],
   - flexibility to support different Braille music dialects.
   The grammar of BMML is specified in [3]. Very briefly, we can say that the BMML elements are of three types:
   - a specific header, in which is encoded all data relating to the document archiving and its structure,
   - container elements, which require a specific number of “children”. A child can be another container or a text element,
   - text elements, which represent the Braille text coded in Unicode.
   BMML attributes are used to encode the meaning of each text element. A lot of them are required.
   The following paragraph shows a very simple example of BMML, but BMML can support more complex notation (tuplets, ornaments, …), which permits its use by professional musicians.
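   A BMML-style fragment illustrating the three element types might look as follows. The element and attribute names here are illustrative assumptions for this sketch, not the normative vocabulary specified in [3]:

    <!-- Schematic BMML-style document; names are illustrative only. -->
    <bmml>
      <head>
        <title>Short example</title>                   <!-- archiving data -->
      </head>
      <score>                                          <!-- container element -->
        <measure number="1">                           <!-- container element -->
          <note pitch="C" value="quarter" octave="4">⠹</note>
          <note pitch="D" value="quarter" octave="4">⠱</note>
        </measure>
      </score>
    </bmml>

Here each text element carries a Unicode Braille character, while the attributes encode its musical meaning, as described above.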
http://imslp.org/wiki/Notebooks_for_Anna_Magdalena_Bach_%28Bach,_Johann_Sebastian%29
   The note information with pitch, duration and octave signs is similar to the information in Braille.
   The BMML file obtained is shown below.
   … differs – all of which proves that the layout information is missing in the code.
   The time signature is the same in both applications but is not the same as the PDF file. This is an important problem because it implies a different meaning of the music. If we convert this file into BMML, the musical information is very different from that obtained with the conversion of a PDF file. MIDI files have to be used with great care because they do not contain important information like fingering, slurs or ties.
   In general, it will be of benefit to download digital scores from a reliable site like a library site. Having obtained a BMML file from whichever source, a blind person can manage the score with a Braille reader or editor. We describe this process in the following section.

4. THE READER OR EDITOR USED BY BLIND PEOPLE

BMR is free software which permits blind users to read, learn and listen to music in a multimodal environment. Each piece of musical information can be accessed in Braille on a refreshable Braille display, by sound via MIDI, or in a spoken form.
   For a beginner, different kinds of Braille music elements may temporarily be hidden, or a brief description of an unknown sign can be given.
   With BMR the user can browse the score, add annotations, find parts and bars, and skip through the score along hierarchical elements. He can, like a sighted person, have access to all the information contained in the score.
   In the status bar of BMR we can read the musical information which corresponds to the Braille character which is after the cursor.

Figure 15. The Braille score in BMR.

   With BME2 [7] the same functionalities are available but, in addition, the user can write musical information in Braille. So users can create their own scores and produce BMML files. With the conversion module they can create MusicXML files and share them with sighted musicians. Of course, the result in graphic form will not be so well laid out as it would be if it had been originally produced in a conventional music editing application, but the score will be immediately readable by a sighted musician and the minor formatting issues can be tidied up in a few minutes. This is enormously valuable for collaboration between sighted and blind musicians, whether they be teachers, students or members of a musical ensemble.
   The way a blind person may access and make music without external aid is explained and demonstrated in the video found at http://www.music4vip.org/video_lesson_item/7.

5. CONCLUSION

This paper describes how a blind user can access, convert, read and write musical scores. The conversion modules plus reading and editing tools are free, accessible and based on the BMML code. To obtain an available score in Braille it is necessary to convert a score into MusicXML format produced by an official editor or library. To facilitate the collaboration between sighted and blind musicians, a reader with musical notation and Braille windows will be designed.

Acknowledgments

Part of this work has been undertaken in the European project Music4VIP. I wish to thank all the partners in the project.

6. REFERENCES

[1] D. Goto, T. Gotoh, R. Minamikawa-Tachino and N. Tamura, “A Transcription System from MusicXML Format to Braille Music Notation”, EURASIP Journal on Advances in Signal Processing, Volume 2007, Article ID 42498.
[2] http://www.dancingdots.com/main/goodfeel.htm (consulted 01/15/2015)
[3] B. Encelle, N. Jessel, J. Mothe, B. Ralalason and J. Asensio, “BMML: Braille Music Markup Language”, in The Open Information Systems Journal, 2009, pp. 123-135.
[4] http://www.music4vip.org/braille_music_reader (consulted 01/15/2015)
[5] Bettye Krolick, New International Manual of Braille Musical Notation, World Blind Union, 1996.
[6] Michael Good, January 13, 2015, MusicXML Definition version 2.0, http://www.musicxml.com/for-developers/musicxml-xslt/musicxml-2-0-to-1-1/
[7] http://www.veia.it/en/bme2_product2015
                                                              236
NON-OVERLAPPING, TIME-COHERENT VISUALISATION OF ACTION COMMANDS IN THE ASCOGRAPH INTERACTIVE MUSIC USER INTERFACE

Grigore Burloiu
University Politehnica of Bucharest
Faculty of Electronics, Telecommunications and Information Technology
[email protected]

Arshia Cont
MuTant Team-Project
IRCAM STMS UMR, CNRS, INRIA, UPMC
[email protected]
Figure 1. The "classic" action block display in AscoGraph. The musical timespans of Groups ONE, TWO and THREE are not represented clearly because of the overlapping of group blocks.

[…] actions section. The instrumental view and electronic view are coupled in musical time along a common horizontal timeline. During a performance, Antescofo's score follower determines the position of AscoGraph's graphical cursor along the timeline.

This paper presents updates to AscoGraph's electronic action view, developed with two directions in mind: (1) a clearer and more time-coherent visualisation of Antescofo scores, and (2) a step towards a complete, self-contained visual notation format for mixed music scores. Section 2 presents the problem of overlapping action blocks, recast as a subset of the two-dimensional strip packing problem. The following section presents the three proposed algorithms for re-arranging action blocks. Section 4 tackles the issue of coherence between block width and musical time. We conclude the paper with an evaluation of the present model and future perspectives.

2. PROBLEM DEFINITION

We distinguish between physical time (measured in seconds) and musical time (measured in beats). The amount of physical time elapsed between actions depends on the tempo detected during performance, and on the active synchronisation strategies [2]. Meanwhile, Antescofo scores are specified in musical time. Since AscoGraph was primarily designed as a score visualisation tool, it employs a musical timeline. When a physical time unit is specified in a score (e.g. "after 2 seconds"), AscoGraph must first translate it to an ideal musical time (e.g. "after 4 beats at 120 bpm") in order to display it.

Fig. 1 shows an example of the original AscoGraph action block arrangement style. Here, each of the four notes (drawn in red in the instrumental view's piano roll) has one corresponding action group block. Durations can – and often do – differ between the length of a note and that of its associated electronic actions. While actions within a single group (e.g. Group FOUR) are stacked consecutively downwards, when two different action groups are partially concurrent, the second group is drawn over the first. Consequently, the first group's duration is no longer clearly shown; things become even more confusing when overlapping automation curves (e.g. the one in Group ONE) are involved.

In order to rectify this loss of coherence and clarity, the need arises to stack action groups in downward non-overlapping order, similarly to how elements within groups are arranged. As the challenge becomes one of efficient management of 2D space, it is useful to describe it as a two-dimensional strip packing problem. A subset of bin packing, strip packing is used in areas ranging from optimizing cloth fabric usage to multiprocessor scheduling [4]. Algorithms seek to arrange a set of rectangles within a 2D space of fixed width and bottom base, and of infinite height. In our present case, the width of the strip corresponds to the total duration of the piece, and the rectangles to be placed are the action group blocks.

A particular constraint separates our problem from the rest of the bin packing literature. Unlike in existing bin packing problems, all AscoGraph action blocks must retain their X coordinate along the time axis. Since we are not allowed to "nudge" blocks horizontally, relying on existing packing algorithms becomes impractical.
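The fixed-abscissa constraint is easy to state in code. The following sketch (ours; names and units are illustrative, not taken from the AscoGraph source) models an action block as a rectangle whose horizontal interval is immutable, together with the time-overlap test that any packing routine here must respect:

    from dataclasses import dataclass

    @dataclass(frozen=True)
    class Block:
        """An action group block; x and width are fixed by musical time."""
        x: float        # onset along the timeline, in beats
        width: float    # duration, in beats
        height: float   # vertical extent, in pixels

    def overlaps_in_time(a: Block, b: Block) -> bool:
        # Blocks compete for vertical space only if their
        # (immutable) time intervals intersect.
        return a.x < b.x + b.width and b.x < a.x + a.width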
3. PACKING ALGORITHMS

We introduce three new algorithms for stacked action group display in AscoGraph's graphical editor. The user can switch between any of them and the original display style through the application's View menu. The appropriate option will depend on score complexity and the user's personal taste.

Please note: following bin packing convention, we shall consider the rectangles as being placed on top of the strip base. Naturally, in the AscoGraph environment the situation is mirrored and we build downwards starting from the upper border.

3.1 First Fit (FF)

The first option is the trivial solution of placing the blocks in the first space where they fit, starting from the base. The benefits of this option are speed and predictability: blocks are placed in the order in which they appear in the source code text, which is also their scheduled temporal order.

The downside can be intuited from Fig. 2a and b. We propose a worst-case scenario: a set of blocks with increasing heights and, for simplicity, all of equal width. While FF would stack them on top of each other (Fig. 2a), the optimal method would stack them two by two (Fig. 2b), so that the maximum height is given just by the final two elements.
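Building on the Block sketch above, a minimal First Fit under the fixed-abscissa constraint can be paraphrased as follows (again ours, not the AscoGraph implementation): each block takes the lowest vertical slot that clears every previously placed block sharing part of its time interval:

    def first_fit(blocks):
        """Assign a y offset to each block, in source order (FF)."""
        placed = []                                   # (block, y) pairs
        for blk in blocks:
            occupied = sorted((y, y + other.height)
                              for other, y in placed
                              if overlaps_in_time(blk, other))
            y = 0.0
            for lo, hi in occupied:
                if blk.height <= lo - y:              # block fits below this neighbour
                    break
                y = max(y, hi)                        # otherwise climb above it
            placed.append((blk, y))
        return placed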
3.2 First Fit Decreasing (FFD)

Note that in the previous case, the optimal configuration can be reached by simply reordering the blocks by height.
Figure 2. (a) FF; (b) OPT (FFD)

This insight lies at the root of the classic FFDH strip packing algorithm [5]. In our case, the FFD algorithm orders the blocks by non-increasing height, after which the First Fit process is applied.¹ Fig. 3a shows an FFD arrangement, along with the optimal solution in Fig. 3b.
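Under the same assumptions, FFD reduces to a one-line preprocessing step over the First Fit sketch above:

    def first_fit_decreasing(blocks):
        # Sort by non-increasing height, then reuse First Fit.
        return first_fit(sorted(blocks, key=lambda b: b.height, reverse=True))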
4. TIME COHERENCE OF ATOMIC ACTIONS

Figure 5. Time-coherent message circles display

Figure 6. Comparison of old and new AscoGraph models over a complex score; (b) new model: blocks are stacked, messages are grouped in time-coherent points.

5. CONCLUSIONS AND FUTURE WORK

We have shown an improved layout mechanism for electronic action groups over a musical timeline in AscoGraph. By stacking action group blocks we ensure information integrity and coherence, while expanding the vertical real estate used. The most basic stacking method, First Fit, is also the most easily readable option for scores of moderate depth. We also proposed two increasingly efficient stacking algorithms, FFD and FFDT, for scores containing larger concentrations of actions per time unit. While superior algorithms are technically conceivable (possibly a metaheuristic scheme built on top of FFDT), the present options were deemed appropriate for the practical use and processing overhead of the AscoGraph software.

Finally, we have introduced a method of displaying related messages on a single line which preserves group hierarchy. The main advantages are time coherence and vertical compactness. Still, this model can be seen as a compromise in our quest for the completely specified, self-contained visual notation format proposed in the introduction. Dynamic constructs from the Antescofo language are in a similar situation: for instance, for a Curve whose duration is a dynamic variable, AscoGraph cannot know its exact plot over time before execution.

Therefore, one direction of future research is a performance simulation mode, decoupled from the compositional display described thus far, in which all messages, Loops and other dynamic constructs are represented as they "happen" in an offline simulation. This function is currently in prototype form, having been first described in [3].

However, the need remains for a graphic compositional model that clearly describes dynamic behaviour and action results. With the growing crystallisation of Antescofo's language into a mature, stable package, the path is now open for research in this direction.
6. REFERENCES

[1] A. Cont, "Antescofo: Anticipatory Synchronization and Control of Interactive Parameters in Computer Music," in International Computer Music Conference (ICMC), Belfast, Ireland, Aug. 2008, pp. 33–40. [Online]. Available: http://hal.inria.fr/hal-00694803
   DYNAMIC NOTATION – A SOLUTION TO THE CONUNDRUM
           OF NON-STANDARD MUSIC PRACTICE
                                                              Georg Hajdu
                                               Center for Microtonal Music and Multimedia
                                                Hamburg University of Music and Theater
                                                 [email protected]
[…] but fails bitterly at non-octave tunings such as the Bohlen-Pierce scale. Equidistant notations, such as the Hauer-Steffens notation [11], have the advantage that transpositions and transformations of tone gestalts become evident, but have historically been rejected because of cultural and economic implications, and most likely also because of the cognitive mismatch between the notation and the piano layout with its black and white keys. There is no reason, though, to shy away from introducing equidistant notation for tunings other than 12EDO (and its related, circle-of-fifths-based tunings). We dub this approach logical notation.

A conductor, finally, has different concerns, as he or she needs to grasp the meaning of the different notation styles used in rehearsals and performances. A conductor needs to hear, identify and compare the sounding events against the score and against an internalized template—a feat facilitated by years of intensive training and practice. He or she may be best served by the representation of music in traditional Guidonian notation, enriched by an extended set of accidentals or by indications of deviations written above the notes. This may also be the notation of choice for instruments such as standard string or wind instruments. We will name this approach conventional or cognitive notation, as it depends on internalized templates.

As we are attempting to establish a taxonomy of notational approaches, we also need to concede that these distinctions are arbitrary to a certain extent. Instrumental and conventional notations have a common root, originating from the logic of the music in practice at the time its notation was standardized.

2.1 Scenarios

One can conceive of the following scenarios in which dynamic notation may be welcome:

• All musicians read from computer/tablet screens, either on isolated devices or on machines in a networked arrangement.
• Alternatively, only the person guiding the rehearsals/performance uses an electronic device while the other members of the ensemble read from paper-based print-outs. In the latter case, the responsibility for guiding the communication on notational aspects lies with him or her.
• Finally, both scores and parts are paper-based, but they contain, on different staves, alternative representations of the music to be performed.

Even in this last case, a system capable of changing views in real time can vastly simplify the process of creating scores and parts.

3. IMPLEMENTATION

A plugin structure for dynamic switching between notation styles has been implemented for the MaxScore Editor. MaxScore is a Max Java object designed and maintained by Nick Didkovsky since 2007 to bring music notation to the Max environment [12].

Since 2010 the author has been developing an editor which also interfaces with Ableton Live via Max for Live. As opposed to the Bach notation objects, which—being native Max externals—provide better integration in the Max environment, the MaxScore Editor is based on a hybrid approach consisting of the core mxj object and a periphery made of numerous C and Java externals, Max abstractions and JavaScript objects (forming the editor's GUI, among other functions; see Figure 1). The advantage of a hybrid system is its high degree of adaptability and versatility. As the communication between the core and the periphery is based on messages, these can easily be intercepted and reinterpreted according to current demand.

The editor handles notation styles like plugins and loads them as Max patches dynamically from a folder in the MaxScore package hierarchy. It is thus very straightforward to add new styles to the existing repertoire.

Every notation style defines five attributes (see the sketch after this list):

• The name of the notation style, appearing in the style menu of the Staff Manager.
• The name of the Max patch containing the pitch maps.
• The number of staff lines employed by the style.
• The micromap to be used for the rendering of accidentals.
• The name of the clef appearing on the staff. If a non-standard clef is specified, such as the Bohlen-Pierce T-clef, an additional definition needs to be given in the form of clef name, glyph, x and y offsets, as well as font name and size.
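As a reading aid, these five attributes map naturally onto a small record. The sketch below is ours (the field names are illustrative, not the editor's actual format), with values echoing the "19EDO 19EDO 5 mM-none default" definition string quoted in Section 3.3.2:

    from dataclasses import dataclass

    @dataclass
    class NotationStyle:
        name: str          # shown in the Staff Manager's style menu
        pitch_patch: str   # name of the Max patch containing the pitch maps
        staff_lines: int   # number of staff lines employed by the style
        micromap: str      # micromap used for rendering accidentals
        clef: str          # name of the clef appearing on the staff

    # e.g. the definition string "19EDO 19EDO 5 mM-none default" becomes:
    nineteen_edo = NotationStyle("19EDO", "19EDO", 5, "mM-none", "default")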
Figure 1. The nested structure of the MaxScore Editor plugin system.

3.1 Micromaps

Micromaps were introduced in 2011 to allow higher-resolution representations of divisions of the semitone. While MaxScore's pitch attribute is stored, processed and played back in 32-bit floating-point precision, the drawing messages generated by the object are limited to quarter tones. Hence, MaxScore would fail to accurately represent music in eighth tones (the standard among spectral composers) or twelfth tones (used by composers such as Ivan Wyschnegradsky and Ezra Sims). Micromaps are Max abstractions that intercept drawing messages and query the pitch and accidental-preference attributes of the corresponding note. Based on this information and the notation style chosen by the user, a micromap sends new, more fine-grained drawing messages to the MaxScore canvas. Currently, the maximum precision is sixteenth tones in Sagittal notation, taking advantage of the enormous set of accidentals in the Bravura font recently released by Steinberg [13].

² no_accidental messages will be ignored by the drawing engine, but are a prerequisite for mapping.

Figure 2. 16th-tone notation with the Sagittal font.

The zone value is computed as

\( Z = \left[\,\operatorname{sgn}(r_{npc}) \cdot r_{npc} \cdot n + \frac{1}{2n}\,\right] \)   (1)

for example, \( [1 \cdot 0.125 \cdot 8 + 0.0625] = 1 \).

This value is fed into a zone-to-glyph-name lookup table (a Max coll object), which sends out accSagittal5CommaUp, the name of the Sagittal accidental in the Standard Music Font Layout specification on which the Bravura font is based [13].

This message is combined with the rest of the message into

"accSagittal5CommaUp 75.555557 81. 0.5 Note 0. 0. 0. 0."

of which the first four items are further processed: the zoom value scales the font size as well as the x and y offsets. The accidental name is sent to another instance of a Max coll object, which returns

" -4 0 Bravura 24" (glyph, x offset, y offset, font name, font size).

This information is then translated into three separate Max lcd messages.
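In Python paraphrase (ours — the actual logic lives in Max abstractions and coll files), the chain from pitch-class deviation to drawing parameters looks roughly as follows; only the accSagittal5CommaUp row is taken from the worked example above, everything else being a stand-in:

    import math

    def zone(r_npc: float, n: int = 8) -> int:
        # Eq. (1): Z = [ sgn(r_npc) * r_npc * n + 1/(2n) ];
        # n = 8 subdivisions per semitone corresponds to sixteenth tones.
        sgn = (r_npc > 0) - (r_npc < 0)
        return math.floor(sgn * r_npc * n + 1.0 / (2 * n))

    # Stand-ins for the two coll lookups described in the text.
    ZONE_TO_GLYPH = {1: "accSagittal5CommaUp"}
    GLYPH_TO_DRAWING = {
        # glyph name -> (x offset, y offset, font name, font size)
        "accSagittal5CommaUp": (-4, 0, "Bravura", 24),
    }

    z = zone(0.125)                      # [1 * 0.125 * 8 + 0.0625] = 1
    name = ZONE_TO_GLYPH[z]              # "accSagittal5CommaUp"
    print(name, GLYPH_TO_DRAWING[name])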
Figure 5. The just intonation notation style consists of nested Max patches illustrating the complexities of the calculations involved.

3.3.2 19EDO

Definition: "19EDO 19EDO 5 mM-none default"

Russian-American musicologist Joseph Yasser argued in his 1932 book Theory of Evolving Tonality that 19-tone music, in its just or equal-tempered forms, constitutes the next logical step in the development of music [15]. While we can no longer subscribe to this claim, the tuning remains a popular one, having been investigated by composers such as Easley Blackwood and Joel Mandelbaum. Its 19 tones form a closed circle of fifths and, thus, the scale possesses a diatonic subset and enharmonic alternatives for each black key, in addition to an e# and a b# between e and f and between b and c, respectively. The mapping is performed by:

1. calculating the 19EDO scale step index with the µUtil.PitchToStep abstraction, which is part of the author's µUtilities package bundled with MaxScore;
2. extracting the octave index and pitch class by dividing the index by 19 and passing the remainder through a lookup table yielding the 12EDO pitch class, accidental preference (sharp or flat) and enharmonic spelling for any of its 19 pitch classes;
3. calculating the pitch by multiplying the octave index by 12 and adding the respective 12EDO pitch class and an offset.

E.g. for 7136 MIDI cents, the scale step index is 113. Divmod 19 yields 5 and 18. Feeding 18 into the coll returns "11 1 1", thus setting setAccPref to sharp and setAltEnharmonicSpelling to true. The pitch is 12 ∙ 5 + 11 + 1 = 72, displayed as b#.
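The same three steps, paraphrased in Python (ours; the 19-entry lookup table is a stand-in for the style's coll file, with only index 18 filled in from the worked example):

    STEP_TABLE = {18: (11, 1, 1)}   # remainder -> (12EDO pitch class, acc. pref., alt. spelling)

    def midicents_to_step(midicents: float, edo: int = 19) -> int:
        # 19EDO scale step index, cf. the muUtil.PitchToStep abstraction.
        return round(midicents / 100.0 * edo / 12.0)

    def map_19edo(midicents: float):
        step = midicents_to_step(midicents)        # 7136 -> 113
        octave, remainder = divmod(step, 19)       # 113  -> (5, 18)
        pc, acc_pref, alt_spelling = STEP_TABLE[remainder]
        # Offset taken here from the table's third value, as in the worked example.
        pitch = 12 * octave + pc + alt_spelling    # 12 * 5 + 11 + 1 = 72
        return pitch, acc_pref, alt_spelling

    print(map_19edo(7136))                         # (72, 1, 1): displayed as b#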
Figure 6. The 19EDO notation style takes pitch, accidental and enharmonic spelling preference into consideration.

3.3.3 Percussion

Definition: "Percussion percussion 5 mM-none percussion"

Most notation programs implement the percussion notation style, in which MIDI notes (range 35–81) are mapped to the white keys between d4 and a5. Percussion notation uses certain positions redundantly, yet differentiates between classes of instruments by assigning various notehead shapes to the notes.

When switching to the percussion notation style, the originalPitch attribute is sent to a Max coll (percussionMap) containing:

• the name of the instrument,
• its (notated) pitch in percussion notation,
• the corresponding notehead shape.
The map will now send three messages to the MaxScore core object: "setPitch value", "noteheadTransform shape" and "setAltEnharmonicSpelling false", the latter message clearing double sharps or flats should they have been set previously.

When switching away from the percussion notation style, the pitch and notehead attributes are evaluated and sent to another coll (inversePercussionMap) in order to clear notehead shapes and reconstruct the originalPitch attribute. The messages to MaxScore are "noteheadTransform NOTEHEAD_STANDARD" and "setNoteDimension originalPitch value".
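A compact paraphrase of this round trip (ours; the two table rows are illustrative stand-ins, not the editor's actual percussionMap contents):

    PERCUSSION_MAP = {
        # originalPitch -> (instrument, notated pitch, notehead shape)
        38: ("snare drum", 67, "NOTEHEAD_STANDARD"),
        42: ("closed hi-hat", 74, "NOTEHEAD_X"),
    }

    def to_percussion(original_pitch: int):
        instrument, notated, notehead = PERCUSSION_MAP[original_pitch]
        # Messages sent to the MaxScore core object:
        return [f"setPitch {notated}",
                f"noteheadTransform {notehead}",
                "setAltEnharmonicSpelling false"]

    def from_percussion(notated: int, notehead: str):
        # Invert the lookup to reconstruct originalPitch on leaving the style.
        for orig, (_, pitch, shape) in PERCUSSION_MAP.items():
            if (pitch, shape) == (notated, notehead):
                return ["noteheadTransform NOTEHEAD_STANDARD",
                        f"setNoteDimension originalPitch {orig}"]
        return []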
[…] to the 12-tone chromatic scale and its diatonic subsets, and whose inherent relationships can be learned through repeated exposure. To allow for a new theory, we came up with a six-line staff, new note names, interval designations and clefs, which we call the Müller-Hajdu notation. There are three clefs, N, T and Z, for which we created corresponding chromatic notation styles (with notes written either without accidentals on a line or between two).⁵
Figure 11. The soprano clarinet fingering notation style uses a lookup table to perform mapping.

3.3.6 Special Applications: Bohlen-Pierce Alto Kalimba

Definition: ""BP Alto Kalimba" BP-alto-kalimba 5 mM-none percussion"

Figure 12. The BP Alto Kalimba notation style maps the ascending pitches of the Bohlen-Pierce scale onto the centrifugal layout of the kalimba tines.

This feat was accomplished by feeding originalPitch values through a µUtil.PitchToStep abstraction (to calculate the Bohlen-Pierce scale step index) and a lookup table, yielding the pitch to be displayed.
                                                                                  implemented in MaxScore, but it remains to be seen
  4. PRACTICAL APPLICATIONS AND FUTURE                                            whether the effort of creating such a notation style or set
                    PLANS                                                         of styles, for that matter, is justified in the light of the
One of my recent musical activities was the arrangement                           many individual solutions created by composers over the
of the piano poem Vers la Flamme op. 72 by Alexander                              last few decades, or whether users would be best served
Scriabin [22] for 3 Bohlen-Pierce clarinets, Bohlen-                              by a separate specialized application. Sara Adhitya and
Pierce guitar, double bass in Bohlen-Pierce scordatura,                           Mika Kuuskankare have demonstrated a possible solution
keyboard in Bohlen-Pierce layout, Bohlen-Pierce kalimba                           using macro-events in PWGL [26] for a piece by Logo-
and tam-tam.                                                                      thetis.
   During the arrangement I:                                                         Currently, our dynamic notations system is somewhat
                                                                                  hampered by efficiency issues found in the Max JavaS-
     1.    imported a MIDI file found on
                                                                                  cript object. An effort will be spent to streamline the code
           www.kunstderfuge.com into the MaxScore Edi-
                                                                                  and to replace it with Max C externals, if necessary.
           tor
     2.    mapped the tracks to the Bohlen-Pierce N, T and                                            5. CONCLUSIONS
           Z clefs                                                                We have developed for the MaxScore editor a plugin
     3.    checked for motivic inconsistencies created by                         structure for dynamic notation that greatly facilitates the
           the automatic mapping and changed pitches                              creation and practice of microtonal music in scenarios
           where necessary                                                        where composers, conductors and performers can no
Applying various styles in an arrangement for Bohlen-Pierce instruments proved to be a viable approach for editing, printing and rehearsing. More notation styles will be added as we further develop this version of the software, which is currently in a beta state and can be downloaded from http://www.computermusicnotation.com.

[10] http://www.marcsabat.com/pdfs/notation.pdf

[11] G. Read, Music Notation: A Manual of Modern Practice. Crescendo Book, 1979, p. 32.

[12] G. Hajdu and N. Didkovsky, "MaxScore – Current State of the Art," Proceedings of the International Computer Music Conference, 2012.

[13] http://www.smufl.org

[14] M. Kuuskankare and M. Laurson, "Expressive Notation Package – an Overview," Proceedings of the 5th International Conference on Music Information Retrieval, 2004.

[15] J. Yasser, Theory of Evolving Tonality. American Library of Musicology, 1932.

[16] N.-L. Müller, K. Orlandatou and G. Hajdu, "Starting Over – Chances Afforded by a New Scale," in 1001 Microtones, M. Stahnke and S. Safari (Eds.). Von Bockel, 2014, pp. 127–172.

Webpages all accessed on January 28, 2015.