Proceedings of the 2012 2nd International Conference on Computer and Information Application (ICCIA 2012)
A Method on Single Source Publishing for Music in DITA
Xuhong Liu, Yunmei Shi, Peng Liu, Ning Li
Computer School, Beijing Information Science & Technology University,
Beijing, China
E-mail: [email protected]
Published by Atlantis Press, Paris, France. © the authors

Abstract—Federated media publishing means distributing single-source content through multiple media after integrating various media forms, and it is the trend in digital publishing. DITA is designed to produce multiple deliverable formats from a single set of DITA content, but it cannot be used directly for federated media publishing because it does not cope well with media other than text and images. This paper takes music as an example to discuss a method of integrating multimedia into DITA. The specialization facility is used to integrate the MusicXML DTDs into DITA, and a solution is put forward to provide appropriate renditions as audio and as five-line staff notation. Experiments show that the solution is an effective way to extend DITA to deal with multimedia and that it can be used in federated media publishing.

Keywords—Darwin Information Typing Architecture; Federated Media Publishing; MusicXML; Single Source

I. INTRODUCTION

During the last ten years, the federated media publishing industry in China has developed rapidly with the widespread use of mobile terminals and the popularity of the Internet. Federated media publishing is a new form of, and a growing trend in, digital publishing. It means distributing single-source content through multiple media after integrating various media forms [1]. For example, from a single source file that contains music information, appropriate renditions are provided as audio and as five-line staff notation. Federated media publishing needs a sound content creation and management mechanism to manage all kinds of digital content for single source publishing.

Several techniques exist for organizing content. They can be divided into three types: paginated documents, flow documents, and combinations of the two. Typical paginated documents include PDF (Portable Document Format), XPS (XML Paper Specification) and CEB (Chinese eBook). Typical flow documents include TXT, HTML (Hypertext Markup Language), DOC and DOCX (Microsoft Word; DOCX is based on Office Open XML) and EPUB (Electronic Publication). An example of the combination of the two types is CEBX (Common e-Document of Blending XML). However, these forms of content organization are not suitable for federated media content, because the content cannot be disassembled and reused once the whole document has been produced, which results in content redundancy.

With the development of digital publishing, other forms of content organization have appeared, such as DITA (Darwin Information Typing Architecture), which was developed by IBM. DITA is an XML-based architecture for authoring, producing and delivering topic-oriented, information-typed content that can be reused and single-sourced in a variety of ways [2-3]. IBM donated DITA to the OASIS standards organization in 2004. DITA is now widely used by technical enterprises for technical document authoring, digital publishing and enterprise informatization (for example knowledge management, content management and document management) [4].

Although DITA provides a better solution for content management and reuse, it cannot be used directly for federated media publishing because it does not support multimedia well. It is therefore necessary to extend the DITA architecture to support multimedia so that it can be used for federated media publishing in the future.

The publication of music is complex because of the rich diversity of its presentation forms. This paper therefore takes music as an example to discuss how to extend the DITA architecture to support multimedia so that it can be used for federated media publishing. As far as the authors know, there is no related research on this subject yet.

DITA-aware commercial tools include Arbortext by PTC and FrameMaker by Adobe. The DITA Open Toolkit (DITA OT) is an implementation of the OASIS DITA Technical Committee's specification for DITA DTDs and schemas. The Toolkit transforms DITA content (maps and topics) into deliverable formats [5].

MusicXML is an XML-based file format for representing Western musical notation. It was developed by Recordare LLC and is designed for the interchange of scores, particularly between different score writers [6].

This paper discusses a method for integrating MusicXML into DITA OT after MusicXML specialization, and for implementing single source publishing of DITA documents that contain MusicXML content. The DITA document can be delivered to different target media according to the user's requirements.

II. DITA OVERVIEW

A. DITA Application Architecture
DITA is a sophisticated XML-based application architecture for authoring, producing, and delivering information. DITA has the predefined DTD and Schema document shells necessary for digital publishing, on which the workflow from content authoring to content delivery is built. The whole process is illustrated in Figure 1; the specific steps are as follows.
(1) Specialization (optional). The specialization feature of DITA allows for the creation of new element types and
attributes that are explicitly and formally derived from existing types. The resulting specialization allows for the blind interchange of all conforming DITA content and a minimum level of common processing for all DITA content. It also allows specialization-aware processors to add specialization-specific processing to existing base processing.
(2) Topic-Based Writing. Information is separated into content chunks of appropriate granularity to produce topics of different types. A topic has just enough content to make sense by itself but not so much content that it covers more than one procedure, one concept, or one type of reference information.
(3) Topic Warehouse. Content is stored according to topic type; each topic is stored as a single unit, and together these units make up the topic warehouse.
(4) DITA Map. The topics chosen from the topic warehouse are assembled into a DITA document according to a DITA map.
(5) Production of Different Deliverables. DITA is designed to produce multiple deliverable formats from a single set of DITA content. DITA supports the separation of content from presentation. Style sheets and XSLT transformations produce different deliverable formats from a single source.

Fig.1 DITA Process (stages: Specialization of a parent type such as <topic> into a child type, Topic-Based Writing, Topic Warehouse, DITA Map, Produce Different Deliverable; XSL style sheets and XSLT transforms produce HTML, PDF and other outputs)

B. DITA Specialization
In DITA, a topic is the basic unit of authoring and reuse. A topic is a unit of information that describes a single task, concept, or reference item. The information category (concept, task, or reference) is its information type (or infotype). A new information type can be introduced by specialization from the structures in the base topic DTD. Typed topics are easily managed within content management systems as reusable, stand-alone units of information.
DITA provides two kinds of specialization: typed topic specialization (infotyped topics) and vocabulary specialization (domains) [7].
The typed topics represent the fundamental structuring layer for DITA topic-oriented content. The basis of the architecture is the topic structure, from which the concept, task, and reference structures are specialized. Extensibility to other typed topics is possible through further specialization. The four information types (topic, concept, task, and reference) represent the primary content categories used in the technical documentation community. Figure 2 illustrates the relationship between the four topics. Moreover, specialized information types, based on the original four, can be defined as required.
Using the same technique as specialization for topics, DITA allows the definition of domains of special vocabulary that can be shared among infotyped topics. Domain specialization does not define new typed topics; it defines new elements and attributes in a vocabulary module.
Both kinds of specialization define elements and entities in module files (.mod and .ent files) and then integrate them into the DTD files of the specialized topic type. The specialization document shell includes the new vocabulary module and the module files of all its ancestors, so the elements defined by the ancestors are valid in the new DTDs. In addition, a dedicated code module and XSL style sheets for processing the new elements and attributes should be provided for the deliverable if necessary.

C. DITA One-Source Publishing
During the fifth step of the DITA process, DITA OT provides a series of XSL style sheets to produce different deliverable forms from a single source. Figure 3 provides an overview of the processing and publishing of DITA documents using the DITA Open Toolkit:
(1) In step 1, the Ant build tool is initialized (either through an optional batch script or directly from the command line), and the arguments that will be passed from the Ant script to Ant are validated.
(2) In step 2, the Ant build tool calls the Toolkit, which produces the rendered temporary files. Input to this step is the .dita and .ditamap source files and the DITA project DTDs or schemas.
(3) In step 3, the XSLT processor (SAXON or Xalan) produces the published output files. Input to this step is the XSLT style sheets for the project and the temporary files produced in step 2.
This process reveals two reasons why the DITA architecture does not support music well. First, DITA cannot recognize the elements defined in MusicXML. Second, DITA provides no code module or style sheet for processing MusicXML elements. Both problems must be resolved in order to extend DITA to support music. The rest of this paper discusses the solution in detail.
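Step (3) of the publishing process, and the way one source yields several deliverables, can be sketched in a few lines of Java. This is a minimal illustration under stated assumptions, not DITA OT's actual code: it uses the XSLT processor bundled with the JDK in place of SAXON or Xalan; the class name SingleSourceSketch, the miniature topic and both style sheets are hypothetical; and the @class values, which DITA DTDs normally supply as attribute defaults, are written out by hand. The templates match on @class tokens rather than on element names, which is the convention that lets a specialized element fall back to the rules of its ancestors.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class SingleSourceSketch {
    // Hypothetical miniature topic; it stands in for a temporary file from step 2.
    static final String TOPIC =
        "<topic class='- topic/topic '>"
      + "<title class='- topic/title '>Happy New Year</title>"
      + "<body class='- topic/body '><p class='- topic/p '>A short song.</p></body>"
      + "</topic>";

    // Hypothetical style sheet for an HTML deliverable.
    static final String TO_HTML =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='html' indent='no'/>"
      + "<xsl:template match=\"*[contains(@class,' topic/title ')]\">"
      + "<h1><xsl:apply-templates/></h1></xsl:template>"
      + "<xsl:template match=\"*[contains(@class,' topic/p ')]\">"
      + "<p><xsl:apply-templates/></p></xsl:template>"
      + "</xsl:stylesheet>";

    // Hypothetical style sheet for a plain-text deliverable from the same source.
    static final String TO_TEXT =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='text'/>"
      + "<xsl:template match=\"*[contains(@class,' topic/title ')]\">"
      + "<xsl:value-of select='.'/><xsl:text>&#10;</xsl:text></xsl:template>"
      + "<xsl:template match=\"*[contains(@class,' topic/p ')]\">"
      + "<xsl:value-of select='.'/><xsl:text>&#10;</xsl:text></xsl:template>"
      + "</xsl:stylesheet>";

    /** Applies one style sheet to one source document and returns the output as a string. */
    static String transform(String xml, String xsl) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xsl)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        // The same source is rendered twice, once per target format.
        System.out.println(transform(TOPIC, TO_HTML));
        System.out.println(transform(TOPIC, TO_TEXT));
    }
}
```

Because the source never changes between the two calls, adding a further deliverable in this sketch only means adding a further style sheet, which mirrors the separation of content from presentation described above.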
III. MUSICXML SPECIALIZATION
The essence of the DITA architecture is inheritance, which enables the reuse of existing information and code modules in the DITA architecture. The user need only concentrate on the new DITA elements. A new DITA element should inherit from an existing DITA element so that it can be mapped to its ancestor. Every element type exists in a specialization hierarchy, which goes from the base module (topic or map) through any intermediate modules to the element itself. Every DITA element must have a @class attribute. The value of the class attribute specifies the specialization hierarchy of the element. The new element can use the transformation rules of its ancestors.
Elements related to music notation are defined in MusicXML, but they have no class attributes to identify their specialization hierarchy, which means that the elements in MusicXML are non-DITA elements. Therefore, the <foreign> element should be specialized to integrate the non-DITA elements of MusicXML. Specialization of the <foreign> element is an open extension to DITA for the purpose of incorporating standard vocabularies for non-textual content. The <foreign> element can be specialized in both topic specializations and domain specializations [8]. This paper chooses domain specialization as an example to illustrate the process.
(1) Establish the music domain module files.
① Define the name of the music domain module, such as "musicxml-d"; this name appears in the specialization hierarchy identified by the class attribute of the new element.
② Integrate the MusicXML DTDs into the music domain module and declare the new element <musicxml>, as listed in the following code. <score-partwise>, the root element in MusicXML, becomes the child element of <musicxml>. The non-DITA elements defined in the MusicXML DTD are integrated in this way.
③ Define the class attribute of the new <musicxml> element. This attribute shows that the <musicxml> element belongs to musicxml-d and inherits from the <foreign> element in topic. Note that only <musicxml> is a DITA element; the other elements in the MusicXML DTD, such as <score-partwise>, are non-DITA elements and so have no class attribute.
The main code in the music domain module file is listed as follows.

    <!-- main code in music domain module file -->
    <!-- integrate MusicXML DTD -->
    <!ENTITY % musicxml-partwise_dtd SYSTEM "partwise.dtd" >
    %musicxml-partwise_dtd;
    <!-- define new element for music domain -->
    <!ENTITY % musicxml.content "((score-partwise)?)">
    <!ENTITY % musicxml.attributes ' '>
    <!ELEMENT musicxml %musicxml.content;>
    <!ATTLIST musicxml %musicxml.attributes;>
    <!-- declare new element's class attribute -->
    <!ATTLIST musicxml %global-atts; class CDATA "+ topic/foreign musicxml-d/musicxml ">

(2) The music domain module in (1) can be shared by any DITA topic type; see reference [9] for details.

IV. IMPROVE DITA CONTENT DELIVERY
After the above specialization, users can author DITA documents that contain music information. The following DITA document is of the concept topic type with the music domain module integrated; <conbody>, which contains the main body of the document, is defined in the concept topic type. The music "Happy New Year" is nested in the document through the <musicxml> element. Only part of the document is listed because of limited space.

    <!-- concept topic type DITA document integrated with music domain module -->
    <!DOCTYPE concept SYSTEM "dtd\concept.dtd">
    <concept id="testmusicdomain">
      <conbody>
        <musicxml>
          ... ...
          <score-partwise>
            <part-list>
              <score-part id="P1">
                <part-name>Piano</part-name>
                <part-abbreviation>Pno.</part-abbreviation>
                <score-instrument id="P1-I3">
                  <instrument-name>Piano</instrument-name>
                </score-instrument>
              </score-part>
            </part-list>
            <part id="P1">
              <measure number="1" width="191.55">
                <note-music default-x="83.43" default-y="-35.00">
                  <pitch>
                    <step>F</step>
                    <octave>4</octave>
                  </pitch>
                  <duration>1</duration>
                  <voice>1</voice>
                  <type>eighth</type>
                  <stem>up</stem>
                  <beam number="1">begin</beam>
                </note-music>
                ... ...
              </measure>
              <measure number="2" width="120.12">
                <note-music default-x="12.00" default-y="-25.00">
                  <pitch>
                    <step>A</step>
                    <octave>4</octave>
                  </pitch>
                  <duration>1</duration>
                  <voice>1</voice>
                  <type>eighth</type>
                  <stem>up</stem>
                  <beam number="1">begin</beam>
                </note-music>
              </measure>
              ... ...
          </score-partwise>
        </musicxml>
      </conbody>
    </concept>

As noted before, new DITA elements and attributes must be derived from those in the more general topic type, so the transformation rules for the ancestor elements and attributes can also be applied to the new DITA elements and attributes. Therefore, the <musicxml> element can be processed in the same way as the <foreign> element. However, DITA doesn't
provide support for music and cannot translate the <musicxml> element and its descendants accurately. Therefore, new specialized transformation rules and a code module should be added to DITA to process the specialized element.
Music can be presented as audio or as five-line staff notation according to the user's requirements; this paper discusses how to generate both presentation forms.

A. Presentation in Visual Media
If the target document is PDF or HTML, the music information in the DITA document should be presented as a five-line staff embedded in the target.
Many document formats and tools, such as PDF readers and web browsers, do not currently support MusicXML, but they may recognize the SVG markup language or images. Two different solutions are designed according to whether or not the target document supports SVG.
For target document formats that support SVG, such as HTML5, the <musicxml> content in the DITA document can be transformed to SVG and then embedded in the target document. The details are as follows.
(1) Extract the <musicxml> element and all its descendant elements; build an XSL transformation according to reference [10] to transform the selected elements into SVG.
(2) Replace the <musicxml> content in the DITA document with the SVG content obtained in the previous step.
(3) Transform the DITA document with the existing XSL style sheets in DITA OT into different formats according to the user's requirements.
If the target document is PDF, the <musicxml> content can be transformed into an FO file that includes SVG by the XSL transformation designed in the first step, because FOP can convert an FO document that contains SVG to PDF.
The XSL file built here needs to be deployed to DITA OT as a plug-in; see reference [5] for details.
For target document formats that do not support SVG, the <musicxml> content can be converted to a five-line staff image, since almost all document formats support images, and then embedded in the target document by the following procedure.
(1) Extract the <musicxml> element and all its descendant elements and transform them to SVG.
(2) Rasterize the SVG with Apache Batik and its Rasterizer Ant task. Both tools are open source and can convert an SVG file to four formats: PNG, JPEG, TIFF and PDF [11].
(3) Replace the <musicxml> content in the DITA document with the rasterized five-line staff image.
(4) Publish the DITA document with the existing XSL files in DITA OT.

B. Presentation in Audio Media
If the target format is audio, a solution for delivering a DITA document that contains music information as an audio file must be built.
There are several audio formats. Among them, MIDI (Musical Instrument Digital Interface) is an electronic musical instrument industry specification that enables a wide variety of digital musical instruments, computers and other related devices to connect and communicate with one another. It is a set of standard commands that allows electronic musical instruments, performance controllers, computers and related devices to communicate, as well as a hardware standard that guarantees compatibility between them. MIDI files are very compact and can be converted to other formats, so the following discusses how to convert a DITA document that includes music information to a MIDI file.
The key problem is to write a code module that converts a DITA document including music information to an audio format; that is, a Java class that outputs an audio file must be built and deployed to DITA OT. The Java class must extend org.apache.tools.ant.Task and implement the conversion in its execute() method, because DITA OT processes a documentation project as an Ant project. The details are as follows:
(1) Extract the <musicxml> element and all its descendant elements.
(2) Treat the content of each <measure> element in <musicxml> as a measure of the music score and the content of each <note-music> element as a musical note. Then build the MIDI messages from the measure and note information and add them to a track; finally, the output MIDI file is generated.
The new Java class is treated as a new rule for processing the <musicxml> element and is deployed to DITA OT through the plug-in extension points mentioned in reference [5]. DITA OT can then deliver audio files.

V. IMPLEMENTATION AND TEST

By extending DITA OT to support MusicXML, a single-source DITA document that includes music information can be delivered in different formats, such as HTML, PDF and audio.
Figure 4 illustrates the presentation of music information in an HTML file transformed from the DITA document listed in Section IV; the user can browse the five-line staff and play the audio in a browser. Figure 5 shows the presentation of music information in a PDF document transformed from the same DITA document.
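The audio path of Section IV.B can be illustrated with a short Java sketch. It is a simplified stand-in for the authors' code module, not a reproduction of it: the class name MusicToMidiSketch and the SAMPLE fragment are hypothetical, the Ant wrapper (a class extending org.apache.tools.ant.Task whose execute() method would call this logic) is left out so that the example needs only the JDK, and the timing constants are assumptions, since the divisions value of the sample score is elided in the listing. Rests, chords and accidentals are ignored.

```java
import java.io.ByteArrayOutputStream;
import java.io.StringReader;
import javax.sound.midi.MidiEvent;
import javax.sound.midi.MidiSystem;
import javax.sound.midi.Sequence;
import javax.sound.midi.ShortMessage;
import javax.sound.midi.Track;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class MusicToMidiSketch {
    static final int TICKS_PER_QUARTER = 480; // assumed MIDI resolution
    static final int DIVISIONS = 2;           // assumed: <duration>1</duration> is an eighth note

    // Hypothetical fragment modelled on the two notes shown in the Section IV listing.
    static final String SAMPLE =
        "<musicxml><score-partwise><part id='P1'>"
      + "<measure number='1'><note-music><pitch><step>F</step><octave>4</octave></pitch>"
      + "<duration>1</duration></note-music></measure>"
      + "<measure number='2'><note-music><pitch><step>A</step><octave>4</octave></pitch>"
      + "<duration>1</duration></note-music></measure>"
      + "</part></score-partwise></musicxml>";

    /** Maps a step letter and octave to a MIDI key number, with C4 = 60. */
    static int midiKey(String step, int octave) {
        int semitone;
        switch (step) {
            case "C": semitone = 0; break;
            case "D": semitone = 2; break;
            case "E": semitone = 4; break;
            case "F": semitone = 5; break;
            case "G": semitone = 7; break;
            case "A": semitone = 9; break;
            case "B": semitone = 11; break;
            default: throw new IllegalArgumentException("Unknown step: " + step);
        }
        return 12 * (octave + 1) + semitone;
    }

    /** Returns the trimmed text of the first descendant element with the given tag name. */
    static String text(Element parent, String tag) {
        return parent.getElementsByTagName(tag).item(0).getTextContent().trim();
    }

    /** Converts the <note-music> elements of a <musicxml> fragment to a type-0 MIDI file. */
    static byte[] toMidi(String musicXml) {
        try {
            Element root = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(musicXml))).getDocumentElement();
            Sequence sequence = new Sequence(Sequence.PPQ, TICKS_PER_QUARTER);
            Track track = sequence.createTrack();
            long tick = 0;
            NodeList notes = root.getElementsByTagName("note-music");
            for (int i = 0; i < notes.getLength(); i++) {
                Element note = (Element) notes.item(i);
                int key = midiKey(text(note, "step"), Integer.parseInt(text(note, "octave")));
                long length = Long.parseLong(text(note, "duration")) * TICKS_PER_QUARTER / DIVISIONS;
                // One NOTE_ON / NOTE_OFF pair per note; notes are played one after another.
                track.add(new MidiEvent(new ShortMessage(ShortMessage.NOTE_ON, 0, key, 80), tick));
                track.add(new MidiEvent(new ShortMessage(ShortMessage.NOTE_OFF, 0, key, 0), tick + length));
                tick += length;
            }
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            MidiSystem.write(sequence, 0, out); // file type 0: a single track
            return out.toByteArray();
        } catch (Exception e) {
            throw new IllegalStateException("Could not convert <musicxml> content to MIDI", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("MIDI bytes written: " + toMidi(SAMPLE).length);
    }
}
```

In a DITA OT plug-in, the byte array would be written to a .mid file in the output directory; here it is kept in memory so that the sketch has no side effects.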
Fig.4 Presentation of Music Information in HTML File
Fig.5 Presentation of Music Information in PDF File

The experimental results show that the method presented in this paper can extend DITA to support music information and produce different deliverable forms from a single source that includes a music score. This provides insight into integrating other multimedia into DITA.

VI. CONCLUSIONS AND FURTHER RESEARCH
Federated media publishing needs a sound content creation and management mechanism to manage all kinds of digital content for single source publishing. Although DITA provides a better solution for content management and reuse, it cannot be used directly in federated media publishing because it does not support multimedia well. This paper puts forward a method to extend the DITA architecture to support music scores; the single-source DITA document can produce different deliverable forms according to the user's requirements. The solution in this paper can be extended to other multimedia.
However, the solution in this paper does not take account of synchronization problems in multimedia. For karaoke, for example, how can a DITA document that includes a music score, video and lyrics be transformed into different deliverables, and how can these be presented synchronously? Future research will address these issues.

ACKNOWLEDGMENT
This work is supported by the Funding Project for Academic Human Resources Development in Institutions of Higher Learning Under the Jurisdiction of Beijing Municipality under Grant PHR201008439 from the Beijing Municipal Education Commission, China. This work is also supported by the project Research on key technology of multimodal output for music in DITA under Grant KM201311232012 from the Beijing Municipal Education Commission, China. We would like to thank the funding agency for supporting this research.

REFERENCES
[1] Y. Zhang, "Federated media publishing: Present Status and Future Development," Modern Publishing, Beijing, vol. 2, pp. 14-17, 2011.
[2] OASIS, "OASIS Darwin Information Typing Architecture (DITA) TC." (2009-5-12). http://www.oasis-open.org/committees/dita/
[3] M. Priestley, "DITA XML: a reuse by reference architecture for technical documentation," Proceedings of the 19th annual international conference on Computer documentation, ACM Press, 2001, pp. 152-156.
[4] F. Wei, "A Study on Darwin Information Typing Architecture," Journal of Intelligence, Xi'an, vol. 28, pp. 172-175, 2009.
[5] DITA Open Toolkit. (2012-9-17). http://dita-ot.sourceforge.net/
[6] L.M. Surhone, M.T. Tennoe, S.F. Henssonow. MusicXML. Saarbrucken, Germany: VDM Publishing House Ltd, 2010.
[7] M. Priestley, D.A. Schell, "Specialization in DITA: Technology, Process, & Policy," ACM International Conference on Design of Communication, ACM Press, 2002, pp. 57-64.
[8] Foreign generalization. (2007-8-1). http://docs.oasis-open.org/dita/v1.1/OS/archspec/foreigngeneralization.html
[9] E. Kimber, "DITA for Practitioners Volume 1: Architecture and Technology," USA: XML Press, 2012.
[10] MusicXML to SVG. (2011-11-2). http://zh.sourceforge.jp/projects/sfne_musicxmltosvg/
[11] Batik SVG Toolkit. (2010-2-2). http://xmlgraphics.apache.org/batik/