Ict in Library Science Ignou
Study Materials
ICT in Libraries
JATINDER SINGH
BLIS (JULY-2018)
www.jatinderjyoti.in
[email protected]
fb/insta: jatinderjyoti.raina
BLIE-229
ICT in Libraries
Indira Gandhi
National Open University
School of Social Sciences
Block
1
LIBRARY AUTOMATION
UNIT 1
Introduction to Library Automation
UNIT 2
Library Automation Processes
UNIT 3
Library Automation – Software Packages
UNIT 4
Library Automation: Application of Open Source Software
Programme Design Committee
Prof. Uma Kanjilal (Chairperson), Faculty of LIS, SOSS, IGNOU
Prof. B.K. Sen, Retired Scientist, NISCAIR, New Delhi
Prof. K.S. Raghavan, DRTC, Indian Statistical Institute, Bangalore
Prof. Krishan Kumar, Retired Professor, Dept. of LIS, University of Delhi, Delhi
Prof. M.M. Kashyap, Retired Professor, Dept. of LIS, University of Delhi, Delhi
Prof. R. Satyanarayana, Retired Professor, Faculty of LIS, SOSS, IGNOU
Dr. R. Sevukan (Former Faculty Member), Faculty of LIS, SOSS, IGNOU
Prof. S.B. Ghosh, Retired Professor, Faculty of LIS, SOSS, IGNOU
Prof. T. Viswanathan, Retired Director, NISCAIR, New Delhi
Dr. Zuchamo Yanthan, Faculty of LIS, SOSS, IGNOU
Conveners:
Dr. Jaideep Sharma, Faculty of LIS, SOSS, IGNOU
Prof. Neena Talwar Kanungo, Faculty of LIS, SOSS, IGNOU
UNIT 1 INTRODUCTION TO LIBRARY AUTOMATION
Structure
1.0 Objectives
1.1 Introduction
1.2 Evolution of Library Automation
1.3 Automated Library Systems
1.3.1 Rationale
1.3.2 Prerequisites and Steps
1.3.3 Procedural Model
1.3.4 Traditional, Automated and Digital: Three Eras of Library Systems
1.4 Automated Library System: Standards and Software
1.4.1 Standards
1.4.2 Software
1.5 Automated Library System: Global Recommendations
1.5.1 OLE Recommendations
1.5.2 ILS-DI Recommendations
1.5.3 Request for Proposals (RFPs)
1.6 Automated Library System: Development of RFP
1.7 Automated Library System: Trends and Future
1.8 Summary
1.9 Answers to Self Check Exercises
1.10 Keywords
1.11 References and Further Reading
1.0 OBJECTIVES
After going through this Unit, you will be able to:
• understand conceptual views related to library automation and evolution of
ILS;
• know features, advantages, requirements, steps, standards and models of
library automation; and
• trace the path of progress and future directions in the development of ILS.
1.1 INTRODUCTION
Library services involve a series of tasks such as acquiring, preparing and organising
documents of different types and in many formats. The activities related
to acquisition of documents, technical processing of acquired documents, and
circulation and maintenance of processed documents are known as housekeeping
operations. In a traditional library system (managed manually) these
time-consuming, labour-intensive activities and routine clerical chores are performed
slowly and expensively by library staff. Since the 1970s (with the advent of the
personal computer), libraries all over the world have increasingly attempted to
automate some of these activities to minimise human clerical routines and
thereby optimise the productivity and creativity of library staff. Library automation
is the generic term that denotes the application of Information and Communication
Technologies (ICT) to perform manual operations in libraries of any type or
size. A library automation process can follow one of three routes: i) a piecemeal approach,
converting individual operations one at a time (for example, installing the
Cataloguing module alone to offer an OPAC); ii) a ‘planned installation’ approach
that works progressively towards an integrated system (for
example, installing the Member management and Circulation modules
after the Cataloguing module); or iii) going directly for a fully integrated
system that covers the operations of all subsystems in the library. Therefore, theoretically,
a typical library automation system may or may not be integrated and may or may not be
deployed on a Local Area Network (or Intranet). In such an automation process, the
functions that may be automated are any or all of the following: acquisition,
cataloguing, member management, circulation, serials control, inter-library lending,
and access to the online public access catalogue (OPAC). However, radical developments in
hardware, software and connectivity, along with reduced costs, have paved the way
for integrated library automation systems (ILS). At present, library automation
systems are integrated systems made up of a set of interlinked modules responsible for
the management of different operational subsystems.
[Figure: diagram with the labels Cataloguing, OPAC, User and Librarian]
The Second Automation Age: This period of library automation was
characterised by the rise of public access, i.e., the arrival of the OPAC as a replacement
for the traditional card catalogue. This period also witnessed major developments
in online access to abstracting and indexing databases, union catalogues, resource
sharing networks and library consortia.
The Third Automation Age: This era was characterised by full-text access
to electronic documents over high-speed communication channels. Digital media
archiving was an important element of library automation in this period. The
advent of the Internet as a global publishing platform and the largest repository of
information-bearing objects revolutionised the ways and means of delivering
library services. As a result, Web-centric library automation was the norm of the
time.
The Fifth Automation Age: The next generation of library automation uses
interactive, collaborative and participative platforms for developing user-oriented
library services with the help of Web 2.0 tools and services. This era of library
automation is also characterised by the capability for on-the-fly integration of
Linked Open Data (LOD) with local library resources and operations (for example,
using the global dataset VIAF (Virtual International Authority File) to manage
the name authority file of a local library catalogue, and integrating a social networking
tool such as Facebook with the OPAC so that users can post a Like against a library document).
Cloud-based library management and Web-scale library management are the norms
of the fifth automation age.
Now you know the phases of development in library automation over almost the
last forty-five years. However, a timeline of the ground-breaking
events in library automation can be a handy tool for you to grasp the path of
development.
1.3.1 Rationale
Society is changing and so are library users. There are many reasons for the
ongoing changes, but the most visible one is the impact of ICT on society. As a
result, libraries need to change to keep pace with these societal changes. Change is also
required for libraries to get continued political and financial support from the parent
organisation as well as from government. The rationale for library
automation may be summarised as below:
• Automation of library housekeeping operations is considered an especially
critical area from which future benefits will emerge. It means that if a library
is not automated, it cannot take advantage of what ICT offers, such as
digitisation, web-enabled library systems, use of linked open data, remote
management of the library, interactive user services etc.;
• Increased operational efficiencies are achieved through library automation;
• Automation of housekeeping operations relieves professional staff from
routine clerical chores and thus makes them available for end-user services;
• Betterment of library services in terms of speed, quality and efficiency;
• Automation may create an interactive, collaborative and participative platform
for user-centric library services;
• Supports improvement of existing services and introduction of new services;
• Makes the library free from two fundamental barriers to information access –
time and space. A web-enabled library system allows access at any time from
anywhere and by anyone;
• An automated library system with the capability to generate extensive reports
and statistics serves as a decision-making tool for library managers
and policy makers;
• An automated library system is able to contribute to resource-sharing
networks and, on the other hand, may take advantage of the resources and
services of library networks; and
• Better management of staff, physical resources and financial resources, and wider
dissemination of information products and services.
But at the same time one should remember that library automation requires huge
initial investments in developing network infrastructure, procuring hardware,
buying/customising software, retraining staff or, in some cases, recruiting
technical staff. It may lead to chaos in resource organisation and dislocations in
user services during the transformation phase. Initially users and staff may feel
uncomfortable, but with the passing of time the benefits of library automation
will be realised by all stakeholders. As ICT has a spillover effect, an automated
library system, after initial teething problems, soon begins to look for other areas
into which bibliographic services can be extended.
System-level requirements
The system level requirements include hardware, network and storage. These
components build the necessary infrastructure for implementation of integrated
library system. The infrastructural requirements for library automation may vary
from simple (inexpensive) to very complex (expensive) depending on factors
like functional requirements, software architecture, support for global domain-
specific standards, interoperability requirements, number of library sites or
branches, number of records to be managed, number of users to be supported,
requirements for managing multi-lingual records, retrieval features, federated
search capabilities etc. The infrastructural requirements are very high for an
automated library system that aims to serve users through a Web-OPAC (which requires
a server, an IP address and a domain name), to support distributed cataloguing (serving
bibliographic data as a Z39.50 server), and to take advantage of cloud
computing. Generally, hardware-level requirements include a server (a centralised
mainframe or minicomputer architecture) and client PCs (low-end computers
for data entry and end-user searching). Storage devices are required to store
bibliographic data (and full-text data in the case of digital media archiving). A network is
required to link the server with storage devices and client PCs.
Software-level requirements
An integrated library system is managed by integrated library management
software (LMS). An LMS manages different functional modules (for different subsystems
of a library) on the basis of a common database (with different tables for
different modules in a relational model). Such an LMS supports seamless exchange
of data (bibliographic data, financial data, member data etc.) between the different
subsystems of an integrated library system. The essential features that should be
supported by an ILS (or LMS) must be known before software is selected.
These are applicable to all modules of any modern LMS and include, but are
not limited to, the following features:
• The LMS must be fully integrated, using a single, common database for all
operations and a common operator interface across all modules;
• The LMS should have capability of supporting multiple branches or
independent libraries, with one central computer configuration sharing a
common database;
• The LMS must allow unlimited number of records, users and organisation-
specific parameters (e.g. loan period rules, fine calculation criteria, hold
parameters etc.);
• The package should include the following facilities, fully developed and
operational at multiple customer sites:
o Bibliographic and inventory control
o Authority control
o Public access catalogue
o Web catalogue interface
o Information gateway (telnet, www, Z39.50, proxy server)
o Acquisition management
o Serials control
o Electronic data interchange (EDI)
o Reservation and materials booking
o Circulation control
o Customised generation of reports and usage statistics
o One-step administrative parameter setting
o Z39.50 server (minimum version 3 and Bath profile level compliant) and Z39.50 client
o Z39.50 copy cataloguing client
o MARC 21 bibliographic and authority record import/export utility
o Outreach services
o Digital media archive system and multimedia
o Fund accounting, bills and fines
o Inter-library loan
o Interoperability and crosswalk
o Web 2.0 support
• LMS must provide continuous backup in suitable media (as per the choice
of libraries) so that all transactions can be recovered to the point of failure;
• LMS must be compliant with the following standards (see section 1.4.1 for
a list of standards):
o Z39.2 information interchange format
o MARC 21, UNICODE (UTF-8 or UTF-16)
o Z39.71 holdings statements
o Z39.50 information retrieval service (client and server, version 3)
o EDIFACT (EDI standard)
o IEEE 802.2 and 802.3 Ethernet
o HTTP, TCP/IP, Telnet, FTP, SMTP
• The LMS should be based on web-centric architecture and extend support
for a range of multi-user and multitasking operating systems and RDBMSs;
• The LMS must be compliant with UNICODE standard for multilingual
support and RFID for inventory management and self-issue/return facility;
• Vendor/Developing group should provide training to enable library staff to
become familiar with system functions and operation, should supply full
and current system documentation in hard copy and in machine-readable
form suitable for online distribution and the LMS should include extensive
online help for users and staff;
• LMS must support multiple hardware architecture in terms of server, network
infrastructure, PC-workstations and peripheral devices;
• LMS must be supported with regular maintenance and on-call service,
periodical software upgrades, continuous R & D, trouble-shooting of third-
party software such as database package and the library automation package,
distribution of problem fixes/patches and emergency services for system
failures and disaster recoveries;
• The package must provide security to prevent accidental or unauthorised
modification of records through the establishment of access privileges unique
to each user on the system and restriction of specific functions to specific
users;
• LMS should provide graphical user interface including, but not limited to
extensive online help, user self-service and personalisation features. The
system should be supported with PC-based alternative that will allow
circulation to continue in the event of system failure, communication failure
and downtime required for maintenance;
• LMS must be compliant with web 2.0 features to support interactive,
collaborative and participative platform; and
• LMS should be updated regularly to take advantages of cutting-edge
technologies like cloud computing, linked open data and semantic web.
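The security requirement in the list above (access privileges unique to each user and restriction of specific functions to specific users) can be illustrated with a minimal sketch. The role names, function names and the `PRIVILEGES` table are hypothetical and chosen only for illustration; a real LMS defines its own privilege model.

```python
# Hypothetical roles and functions; a real LMS defines its own privilege model.
PRIVILEGES = {
    "cataloguer": {"edit_bibliographic_record", "search_catalogue"},
    "circulation_desk": {"issue_item", "return_item", "search_catalogue"},
    "patron": {"search_catalogue", "place_reservation"},
}


def is_allowed(role: str, function: str) -> bool:
    """Return True only if the role has been granted the function."""
    return function in PRIVILEGES.get(role, set())


def edit_record(role: str, record: dict, field: str, value: str) -> dict:
    """Modify a record only when the caller holds the editing privilege."""
    if not is_allowed(role, "edit_bibliographic_record"):
        raise PermissionError(f"{role} may not modify bibliographic records")
    updated = dict(record)  # work on a copy so the original stays untouched
    updated[field] = value
    return updated
```

In this sketch an unknown role receives no privileges at all, so unauthorised modification fails by default rather than by exception.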
Steps of library automation
Library automation is a complex process and should be planned astutely. The
complete process of library automation may be divided into the following steps:
• Software selection
• Hardware selection
• Site preparation
• General training
• Customisation
• Defining procedures for
o Bibliographical data entry
o Administrative data entry
o Financial data entry
• Commissioning
It is quite obvious that implementing the above steps in library automation
requires a background study or analysis of the library system (see section 1.3.3 for
the system analysis process). This is a precondition for using a library automation package
effectively. A library will not be able to take full advantage of automation
unless its manual functions are sound and justified. Therefore, the
procedures and tasks followed in different sections should be analysed in terms
of:
• Special features of the library system
• Local variations (their validity and usefulness)
• Limitations of the existing system
• Nature and objectives of library
• Size and nature of the collection
• Annual acquisitions and the procedures followed for acquisition
• Annual serials subscriptions and number of back volumes
• Number of users and their categories
• Daily transactions (issue/return/reservation)
• Availability of multilingual documents
• Need of information services (CAS/SDI etc.)
• Future plan (in terms of networking and consortia, digitisation, cloud
computing)
• Available manpower (computer literate staff, retraining of staff, recruitment
of technical staff).
This is an illustrative list of factors to be considered during the process of library
automation. In reality a library needs to prepare a comprehensive list of such
factors for effective utilisation of the automated library system.
1.4.1 Standards
Standards are developed by general agreement among the stakeholders of an area of
human activity. They are used by professionals such as scientists, engineers and
technologists in their respective domains of activity. We often use the terms
standards, guidelines and specifications synonymously, but they differ. A “guideline” is a
statement of policy by a person or group having authority over an activity. A
“standard” is formulated by agreement and is applicable at an array of levels –
corporate, national, or international. A “specification” is a concise statement of
the requirement for a material, process, method, procedure or service. Standards
are frequently updated, modified or revised to keep pace with technological
changes and practical requirements (Withers, 1970). ANSI (American National
Standards Institute) defines a standard as a specification accepted by a recognised
authority as the most practical and appropriate current solution to a recurring
problem. ISO/IEC Guide 2:2004 of ISO (International Organization for Standardization) defines
a standard as a document, established by consensus and approved by a recognised
body, that provides, for common and repeated use, rules, guidelines or
characteristics for activities or their results, aimed at the achievement of the
optimum degree of order in a given context. Standards play important roles
in the development of integrated library systems. They serve:
• to act as the pattern of an ideal;
• to set a model procedure;
• to achieve interoperability in heterogeneous environment;
• to establish measure for appraisal;
• to act as stimulus for future development and importance; and
• to help as an instrument to assist decision and action.
Standards are mainly developed by Standards Development Organisations
(SDOs). An SDO is any entity whose primary activities are developing,
coordinating, promulgating, revising, amending, reissuing, interpreting, or
otherwise maintaining standards. SDOs are generally grouped by two parameters
– geographic designation (e.g. international, regional, national) and organisational
authority (e.g. governmental, quasi-governmental or non-governmental entities).
Library professionals are generally interested in the library standards developed
by their national standards organisations (e.g. BIS – Bureau of Indian Standards
in India) and library standards developed by ISO (International Organization for
Standardization), NISO (National Information Standards Organization, US) and
BSI (British Standards Institution, UK). The library standards developed by NISO
are American national standards but in many cases these standards are used by
libraries/related organisations across the globe (e.g. Z39.50). These SDOs develop
standards in the domain of library services through designated committees and
sub-committees. The committee IDT/2 is entrusted by BSI (http://www.bsi-global.com/)
with Information and Documentation. There are mainly four
American National Standards Committees under NISO that develop standards
affecting libraries, information services and publishing (www.niso.org). These
are X3 (Information Processing Systems); PH5 (Micrographic Reproduction);
Z85 (Standardisation of Library Supplies and Equipment); and Z39 (Library and
Information Sciences and Related Publishing Practices). Of these, Z39 has
developed more standards directly related to LIS fields than the others. TC 46
committee of ISO (www.iso.org/iso/) is responsible for standardisation of
practices relating to libraries, documentation and information centres, publishing,
archives, records management, museum documentation, indexing and abstracting
services, and information science. The secretariat of TC 46 is in France (AFNOR
- Association française de normalisation). It works through three working groups
(WG), four sub committees (SC) and one coordinating group (CG). In BIS, India,
MSD 5 (www.bis.org.in) is the Sectional Committee for Documentation and
Information.
Although it is difficult to list all the standards related to automated library systems,
we may list a set of minimum standards that need to be supported by
an ILS/LMS to remain globally competitive and interoperable. These are:
• ISO – 2709 for bibliographic data interoperability;
• Standard bibliographic formats compliant with ISO - 2709 (e.g. MARC 21,
UNIMARC, CCF/B);
• Z39.50 protocol standard for distributed cataloguing;
• Z39.71 standard for holdings statements;
• BS ISO 9735-9:2002 Electronic data interchange for administration,
commerce and transport (EDIFACT);
• Z39.83-1 (NISO Circulation Interchange Part 1: Protocol (NCIP));
• Z39.83-2 (NISO Circulation Interchange Protocol (NCIP) Part 2: Implementation
Profile 1);
• ISO/CD 28560-1 (Information and documentation – Data model for use of
radio frequency identifier (RFID) in libraries – Part 1: General requirements
and data elements);
• ISO/CD 28560-2 (Information and documentation – Data model for use of
radio frequency identifier (RFID) in libraries – Part 2: Encoding based on
ISO/IEC 15962);
• ISO/CD 28560-3 (Information and documentation – Data model for use of
radio frequency identifier (RFID) in libraries – Part 3: Fixed length
encoding); and
• ISO/IEC 10646: 2003 (Universal Multiple-Octet Character Set or UCS).
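To make the role of ISO 2709 as an exchange format more concrete, the following minimal sketch assembles and then reads back a record in the ISO 2709 layout: a 24-character leader, a directory of 12-character entries (tag, field length, starting position) and the variable fields. The function names `build_record` and `parse_record` are illustrative and do not belong to any library package, and the fixed leader values are assumed MARC 21-style defaults. It is a simplified illustration rather than a complete implementation of the standard.

```python
FIELD_END = b"\x1e"   # field terminator
RECORD_END = b"\x1d"  # record terminator


def build_record(fields: list[tuple[str, bytes]]) -> bytes:
    """Assemble leader + directory + data from (tag, data) pairs."""
    directory = b""
    data = b""
    for tag, value in fields:
        if len(tag) != 3:
            raise ValueError("tags must be exactly three characters")
        body = value + FIELD_END
        # Directory entry: 3-char tag, 4-digit length, 5-digit start offset.
        directory += f"{tag}{len(body):04d}{len(data):05d}".encode("ascii")
        data += body
    base_address = 24 + len(directory) + 1  # leader + directory + terminator
    total = base_address + len(data) + 1    # plus the record terminator
    # Assumed leader defaults: positions 0-4 record length, 12-16 base address.
    leader = f"{total:05d}nam a22{base_address:05d} a 4500".encode("ascii")
    return leader + directory + FIELD_END + data + RECORD_END


def parse_record(raw: bytes) -> list[tuple[str, bytes]]:
    """Read the leader and directory, then slice out each field."""
    leader = raw[:24]
    if int(leader[0:5]) != len(raw):
        raise ValueError("record length in leader does not match the data")
    base_address = int(leader[12:17])
    directory = raw[24:base_address - 1]  # exclude the directory terminator
    fields = []
    for i in range(0, len(directory), 12):
        entry = directory[i:i + 12]
        tag = entry[0:3].decode("ascii")
        length = int(entry[3:7])
        start = int(entry[7:12])
        field = raw[base_address + start:base_address + start + length]
        fields.append((tag, field.rstrip(FIELD_END)))
    return fields


# Example: a control number and a title field (indicators "10", subfield $a).
sample = [("001", b"rec-0001"), ("245", b"10\x1faLibrary automation")]
raw_record = build_record(sample)
```

Because every field is located through the directory, a receiving system can read a record without knowing in advance which tags it contains, which is what makes the format suitable for exchange between different LMSs.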
Apart from these formal standards (de jure standards), there are a few
specifications (which may be considered de facto standards) in the domain of library
services that are widely used across different library systems in different
countries. Most of these internationally agreed informal standards are
developed by national libraries (e.g. Library of Congress) and library associations
(e.g. ALA, IFLA etc.). Some of the most important non-formal standards are:
• MARCXML – MARC 21 data in an XML structure (developed by Library
of Congress – http://www.loc.gov/standards/marcxml/), acting as a base standard
for bibliographic data export/import in place of ISO-2709;
• MODS (Metadata Object Description Schema) – XML markup for selected
metadata from existing MARC 21 records as well as original resource
description (developed by Library of Congress – http://www.loc.gov/standards/mods/);
• MADS (Metadata Authority Description Schema) – XML markup for
selected authority data from MARC 21 records as well as original authority
data (developed by Library of Congress – http://www.loc.gov/standards/mads/);
• METS (Metadata Encoding & Transmission Standard) – Structure for
encoding descriptive, administrative, and structural metadata (developed by
Library of Congress – http://www.loc.gov/mets/);
• PREMIS (Preservation Metadata) – A data dictionary and supporting XML
schemas for core preservation metadata needed to support the long-term
preservation of digital materials (developed by Library of Congress –
http://www.loc.gov/standards/premis);
• SRU/SRW (Search and Retrieve URL/Web Service) – Web services for search
and retrieval based on Z39.50 semantics (developed by Library of Congress –
http://www.loc.gov/standards/sru/); and
• OAI-PMH Version 2.0 – Open Archives Initiative Protocol for Metadata
Harvesting (developed by the Open Archives Initiative).
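Both SRU and OAI-PMH carry their requests as ordinary HTTP query parameters, which the following sketch illustrates by composing request URLs. The base URLs are placeholders, the helper names are hypothetical, and the `recordSchema` value `marcxml` is an assumption because schema short names vary from server to server. The parameter names themselves (`operation`, `version`, `query`, `maximumRecords`, `verb`, `metadataPrefix`) come from the two protocols.

```python
from urllib.parse import urlencode


def sru_search_url(base_url: str, cql_query: str, max_records: int = 10) -> str:
    """Compose an SRU searchRetrieve request for a CQL query."""
    params = {
        "operation": "searchRetrieve",
        "version": "1.2",
        "query": cql_query,
        "maximumRecords": str(max_records),
        "recordSchema": "marcxml",  # assumed; depends on the target server
    }
    return f"{base_url}?{urlencode(params)}"


def oai_list_records_url(base_url: str, metadata_prefix: str = "oai_dc") -> str:
    """Compose an OAI-PMH ListRecords request for harvesting metadata."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return f"{base_url}?{urlencode(params)}"


# Placeholder endpoints, used only to show the shape of the requests.
sru_url = sru_search_url(
    "https://opac.example.org/sru", 'dc.title = "library automation"', 5
)
oai_url = oai_list_records_url("https://repo.example.org/oai")
```

The resulting URLs can be opened in a browser or fetched by any HTTP client, which is one reason these protocols are easier to adopt than the original Z39.50 protocol.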
1.4.2 Software
You already know that library management software forms the core part of
integrated library automation. You also know the prerequisites for an
ILS, the standards that need to be supported by an ILS, and how the procedural
model of library automation is guiding the development of ILSs all over the world.
Rapid developments in hardware, software and connectivity, along
with reduced costs, have paved the way for integrated library automation systems.
Current library automation software packages, also known as Library Management Software
(LMS), are integrated systems made up of a set of related modules responsible for the
management of different operational subsystems. These LMSs are based on
relational database architecture. Most LMSs are presently based on the
procedural model of library automation and follow a modular approach to perform
the tasks related to housekeeping operations. Generally, the whole package is
divided into modules, one for each operational subsystem. Modules are divided into
sub-modules, and each sub-module supports various facilities to carry out the tasks
related to the procedures.
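The idea of several modules working on one common relational database can be sketched with Python's built-in `sqlite3` module. The table and column names below are hypothetical and far simpler than the schema of any real LMS. The point is only that the cataloguing, member management and circulation modules read and write tables in the same database, so data entered once is available to every module.

```python
import sqlite3


def create_library_db() -> sqlite3.Connection:
    """Create an in-memory database shared by three illustrative modules."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        -- Cataloguing module
        CREATE TABLE biblio (biblio_id INTEGER PRIMARY KEY, title TEXT NOT NULL);
        -- Member management module
        CREATE TABLE member (member_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        -- Circulation module: refers to rows owned by the other two modules
        CREATE TABLE loan (
            loan_id   INTEGER PRIMARY KEY,
            biblio_id INTEGER NOT NULL REFERENCES biblio(biblio_id),
            member_id INTEGER NOT NULL REFERENCES member(member_id),
            due_date  TEXT NOT NULL
        );
        """
    )
    return conn


def items_on_loan(conn: sqlite3.Connection) -> list:
    """Join the three tables to report who holds which title."""
    rows = conn.execute(
        """
        SELECT b.title, m.name, l.due_date
        FROM loan AS l
        JOIN biblio AS b ON b.biblio_id = l.biblio_id
        JOIN member AS m ON m.member_id = l.member_id
        ORDER BY l.due_date
        """
    )
    return rows.fetchall()


conn = create_library_db()
conn.execute("INSERT INTO biblio VALUES (1, 'Library Automation')")
conn.execute("INSERT INTO member VALUES (1, 'A. Reader')")
conn.execute("INSERT INTO loan VALUES (1, 1, 1, '2024-01-15')")
```

The circulation table stores only identifiers, so a correction made in the catalogue record is reflected immediately in every circulation report.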
Cataloguing
• Standard formats support
• Authority control (in MARC 21 authority format)
• Integration with Linked Open Data (LOD)
• Unicode-compliant multilingual data processing
• Retrieval with sophisticated search operators
• Integration with virtual keyboard for multilingual searching
• Shared cataloguing
• Z39.50 based copy cataloguing
• Output generation and holdings information
• User services (interactive and participative).
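Unicode-compliant multilingual processing means that a title in any script is stored as a sequence of code points and only then encoded as bytes. The short sketch below compares the UTF-8 and UTF-16 encodings of a Hindi word. The helper name `encoded_sizes` is illustrative, and the word is written with escape sequences so that its code points are explicit.

```python
# "pustakalaya" (library) in Devanagari, spelled out code point by code point.
title = "\u092a\u0941\u0938\u094d\u0924\u0915\u093e\u0932\u092f"


def encoded_sizes(text: str) -> dict:
    """Report how many code points and encoded bytes a string occupies."""
    return {
        "code_points": len(text),
        "utf8_bytes": len(text.encode("utf-8")),
        # utf-16-le is used so that no byte-order mark is counted.
        "utf16_bytes": len(text.encode("utf-16-le")),
    }


sizes = encoded_sizes(title)

# The same text survives a round trip through either encoding.
round_trip_ok = (
    title.encode("utf-8").decode("utf-8") == title
    and title.encode("utf-16-le").decode("utf-16-le") == title
)
```

Devanagari characters take three bytes each in UTF-8 and two in UTF-16, so the choice of encoding affects storage size but not the text that users search and see.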
Access Services
• Online access
• Public access interface (OPAC)
• Web access and Remote access
• Social-network enabled OPAC
• Gateway services.
Circulation Control
• Setting of user privileges
• Circulation rules
• Issue, return and renewal
• Reservation (user-driven)
• Fine calculation
• User management
• Reminders and recalls
• Enquiries (about item, borrower, reservation)
• Reminders and notices
• Reports and statistics and patron self services.
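Circulation rules and fine calculation lend themselves to a small worked example. In the sketch below the user categories, loan periods and fine rates in `RULES` are invented for illustration, as are the function names. An actual library sets these values as organisation-specific parameters in its LMS.

```python
from datetime import date, timedelta

# Hypothetical circulation rules per user category.
RULES = {
    "student": {"loan_days": 14, "fine_per_day": 1.0},
    "faculty": {"loan_days": 30, "fine_per_day": 2.0},
}


def due_date(issued_on: date, category: str) -> date:
    """The due date is the issue date plus the category's loan period."""
    return issued_on + timedelta(days=RULES[category]["loan_days"])


def overdue_fine(issued_on: date, returned_on: date, category: str) -> float:
    """Charge the daily rate for each day past the due date, never less than zero."""
    days_late = (returned_on - due_date(issued_on, category)).days
    return max(days_late, 0) * RULES[category]["fine_per_day"]
```

For instance, under these assumed rules a student who borrows a book on 1 January and returns it on 20 January is five days late.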
Serials Control
• Order placement and renewal of subscriptions
• Kardex management
• Receiving and claiming
• Binding control
• Fund accounting
• Cataloguing of serials
• Enquiries (arrival of serials issues)
• Reports and statistics.
MIS
• Reports and statistics
• Analysis of statistics
• Usage statistics (compliant with COUNTER).
Outreach Services
• Community information services
• Social-networking support
• Library blog
• Online help for users.
System Administration
• Privileges control
• Branch management
• Backup and restoration
• System configuration.
A library may procure commercially available ILS or may opt for implementing
an open source ILS. But the above-mentioned basic tasks of an ILS are common
to all types of ILSs or LMSs.
Self Check Exercises
Note: i) Write your answers in the space given below.
ii) Check your answers with the answers given at the end of this Unit.
11) What is a standard? Why should an ILS support global standards? List the
standards required for a globally competitive ILS.
......................................................................................................................
......................................................................................................................
......................................................................................................................
......................................................................................................................
......................................................................................................................
12) Discuss the typical tasks performed by an integrated library system.
......................................................................................................................
......................................................................................................................
......................................................................................................................
......................................................................................................................
......................................................................................................................
A time frame for completing the steps needs to be set and followed strictly to achieve
targets. David (2001) suggested a time frame that gives the standard length
of time needed to complete each stage of the process. Table 1.3 illustrates
the time frame developed by David (2001) for the RFP and selection processes.
Table 1.3: Time frame for steps in RFP development (source: David, 2001)
Steps Month 1 Month 2 Month 3 Month 4 Month 5+
Needs assessment ×
Studying available ILS ×
Listing potential vendors of the ILS ×
Specifying needs ×
Specifying criteria for evaluation
Developing a timeframe ×
Writing the RFP ×
Submitting to legal office for comment ×
Rewriting according to the specifications of legal office ×
Submitting to vendors ×
Receiving proposals from vendors ×
Evaluating proposals ×
Preparing a short list of vendors ×
Requesting for a demo of the system ×
Selecting your system ×
Preparing the contract ×
Implementing the system ×
Evaluating the implemented system ×
i) using ILS available in remote server through web browser without any
installation;
ii) hosting the Web-OPAC and staff interfaces in remote server without burden
of local management of server and arrangement of IP address and domain
name;
iii) setting up own remote file storage and database system (with scheduled
backups).
Cloud computing mainly offers three service models: Infrastructure
as a Service (IaaS), Platform as a Service (PaaS) and Software as a Service (SaaS).
Cloud-based library automation has the following advantages:
ii) Virtualisation (libraries do not have to care about the physical management
of hardware, software, user interface, data backup and hardware
compatibility);
vi) Metered billing (libraries are charged only for what they use).
On the whole, cloud-based library automation is quite useful and cost-effective for
small and medium-sized libraries. Large-scale libraries may offer datasets on the
cloud for use by small libraries (Data as a Service (DaaS)). Some of the
well-known cloud-based services are listed in Table 1.4 for your ready reference.
The major cloud service providers and related services are listed in Table 1.5.
3) Linked Open Data (LOD)
Linked Open Data (LOD) refers to publishing and connecting structured data on
the Web for use in the public domain. The three key technologies that support LOD
are: URI (Uniform Resource Identifier, a generic means to identify entities or
concepts on the web), HTTP (Hypertext Transfer Protocol, a simple yet universal
mechanism for retrieving resources, or descriptions of resources, over the web),
and RDF (Resource Description Framework, a generic graph-based data model to
structure and link data that describes things on the web). Linked Open Data (LOD)
has two basic purposes:
i) publish and link structured data on the Web; and
ii) create a single globally connected data space based on the web architecture.
Tim Berners-Lee advocated four rules for converting dataset to LOD. These are:
1) Use URIs as names for things;
2) Use HTTP URIs so that people can look up those names;
3) When someone looks up a URI, provide useful information, using the
standards (RDF, SPARQL); and
4) Include links to other URIs, so that they can discover more things.
The W3C established the Library Linked Data Incubator Group in 2010 “to help increase
global interoperability of library data on the Web, by bringing together people
involved in Semantic Web activities — focusing on Linked Data — in the library
community and beyond, building on existing initiatives, and identifying
collaboration tracks for the future.” Libraries may utilise bibliographic data,
authority data, classification schemes, vocabulary control devices etc. available
as LOD for enriching existing library services and for introducing new information
services. Some major examples of library LOD are AGROVOC (a multilingual
structured and controlled vocabulary), the British National Bibliography (BNB)
published as Linked Data, VIAF, LCSH, the LC Name Authority File (NAF), which
provides authoritative data, MARC country and language codes, Dewey.info etc. ILSs
are taking advantage of LOD available in the library domain by integrating it through
appropriate APIs. For example, the cataloguing module of Koha can be linked
with VIAF (Virtual International Authority File, a linked dataset of authority data
from 21 major national libraries of the world) to obtain authority data
automatically and control name authority in the local library catalogue.
6) Information mashup
Information mashup tools allow remixing of data, technologies or services from
different online sources to create new hybrid services (O’Reilly, 2005) through
lightweight application programming interfaces (APIs). An ILS uses information
mashups to manage and integrate virtual contents distributed globally with
local library resources. Information mashups are becoming a popular application
of Web 2.0 around the world; examples include KohaZon (integration of Koha OPAC with
Amazon services), WikiBios (a mashup where users can create online biographies
of each other in a wiki setup), LibraryLookup (integration of Google Maps with
a library directory service in the UK) and many more.
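The idea of a mashup can be sketched in a few lines. This is a minimal illustration, not the way any particular ILS does it: the ISBNs, titles and the `EXTERNAL_ENRICHMENT` dictionary (which stands in for a remote web API) are all hypothetical.

```python
# Local catalogue records, keyed by ISBN (sample data for illustration).
LOCAL_CATALOGUE = {
    "9780000000001": {"title": "Sample Title One", "call_number": "025.04 SAM"},
    "9780000000002": {"title": "Sample Title Two", "call_number": "020 SAM"},
}

# Stand-in for an external web service reached through an API (hypothetical).
EXTERNAL_ENRICHMENT = {
    "9780000000001": {"cover_url": "http://example.org/covers/1.jpg", "rating": 4.5},
}


def mashup(catalogue, enrichment):
    """Combine local records with external data that shares the same ISBN."""
    combined = {}
    for isbn, record in catalogue.items():
        merged = dict(record)                    # copy, so local data is untouched
        merged.update(enrichment.get(isbn, {}))  # add external fields, if any
        combined[isbn] = merged
    return combined
```

The local record stays the authoritative source. External fields such as a cover image are layered on top only where a matching identifier exists.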
rich user experiences in terms of speed, relevance, and ability to interact
consistently with results. Moreover, the unified interface is a big boost for users
as they no longer need to choose a specific search tool to begin their search.
These tools are available commercially (e.g. EBSCO Discovery Service) and
also as open source products (such as VuFind, SOPAC, Blacklight, OpenBib
etc.).
13) Emergence of open standards
Open standards are available in the public domain. These are standards that anyone
can incorporate into their software, services and systems. The MARC record standard
is possibly the most visible open standard in the domain of library services.
Library systems of any type or size are required to be compatible with global
standards to achieve interoperability. Here lies the importance of open standards.
They are developed, approved and maintained via a collaborative process to
facilitate exchange of datasets. These standards are available at no cost,
well-documented, transparent and free from any kind of use restriction. ILSs are
increasingly depending on open standards such as the MARC 21 family of standards
(five standards), OAI-PMH, CCL (Common Command Language), ZING, Dublin
Core metadata standard, SRU, SRW, OpenURL, MARC-XML, METS, MODS
etc.
1.8 SUMMARY
Library automation is the base on which future library services will be built. A
library that is not automated will not be in a position to take advantage
of ICT-enabled library services in future. This Unit acts as a foundation and aims
to introduce you to the concept of the integrated library system and the advantages
associated with it. It covers the historical and theoretical foundations of library
automation, supported by a timeline of the development of related technologies. In
this Unit you can find guidance to: 1) identify the requirements for library
automation; 2) follow a model for an integrated library system; 3) differentiate
automated and digital library systems; 4) understand the typical steps for
accomplishing library automation; 5) appreciate the need for standards in
ILS and recognise the essential standards that need to be ensured; and 6) identify
features of ILS in a rapidly changing technological environment. This Unit also
provides knowledge about emerging global recommendations for developing
ILS in the context of cutting-edge technologies like cloud computing, linked
open data and web-scale library management. It also covers the roles and components
of an RFP and the steps for developing an RFP for library automation, and allows you to
develop skills in preparing an RFP. This Unit ends with a brief discussion on
forthcoming features and ongoing changes in the arena of ILS against a fifteen-point
checklist.
1.9 ANSWERS TO SELF CHECK EXERCISES
2) An ILS is capable of managing the operations of more than one basic library
function by sharing the files in the server to perform them. For example,
data from the book catalog master file and the patron master file can be
retrieved and used in the circulation module to perform the circulation
function of the ILS. In such systems files are interlinked so that deletion,
addition and other changes in one file automatically activate changes in
related files. It means integrated library management system is sharing a
common database to perform all the basic functions of a library.
3) Library automation is a generic term that refers to the application of
computers in libraries to automate operations. It can be a standalone system
supporting only one module like cataloguing or it can be integrated to link
all modules or library subsystems through a common shared database. On
the other hand, ILS is an automated library system that utilises shared data
and files to provide interoperability of multiple library functions, e.g.
cataloging, acquisition, circulation, serials, etc.
9) The procedural model of library automation was proposed by ASLIB (Association
for Information Management, UK) as a general model for automating library
housekeeping operations. At present most ILSs follow this model for
designing their different functional modules. The model proposes that a
library system has two main subsystems: an administrative subsystem and
an operational subsystem (amenable to automation). The operational subsystem
may be divided into four further subdivisions, namely Acquisition,
Processing, Use and Maintenance. Within each of these divisions there are
a number of procedures (eighteen in total) and within each procedure there
are one or more of six possible activities. The procedures and activities are
carried out by fifteen basic tasks.
10) Digital libraries are managed collection of digital objects that provide full-
text access to resources and differ significantly from automated library
systems in terms of – 1) search features (metadata only vs. full-text and
metadata); 2) document description (MARC 21 vs. Dublin Core); 3)
interoperability standards (Z39.50 vs. OAI/PMH); and 4) software
architecture (centralised vs. distributed).
13) Designing of future friendly ILS requires guidelines. OLE project and ILS-
DI recommendations are acting as such guidelines recognised globally. The
principal aim of OLE project is cost-effective integration of library
management with other institutional systems on the basis of Enterprise
Resource Planning (ERP) enabled Abstract Reference Model. On the other
hand, ILS-DI guides developers in – 1) Data aggregation (harvesting and
distributed searching); 2) Search (simple and advanced search operators); 3)
Patron services (general and interactive interfaces); and 4) Integrated service
framework (on-the-fly integration of open contents, data sets etc.).
14) A request for proposal (RFP) is a formal request for a bid from suppliers of
library systems, or from third-party software vendors in the case of open source
software. RFPs aim to determine library requirements, prescribe
standards and demand services from ILS vendors and developers. The
RFP prescribes the resources that need to be acquired, the services that need
to be offered, the standards that need to be supported, the selection criteria for
the ILS, and the requirements for the software vendor, including a time schedule
for each level of activities. It guides the library in the evaluation of integrated
library systems and helps the library to choose and acquire the most
appropriate system.
16) L. T. David in 2001 advocated a set of steps for developing RFP. The process
starts with need assessments and ends with evaluation of implemented
system. It includes a total of eighteen steps.
18) Cloud-based library automation is quite useful and cost effective for small
and medium sized libraries. Cloud computing refers to network-based computing
facilities that support on-demand use of hardware and software resources.
Libraries can take advantage of cloud computing in the following ways:
i) by using an ILS available on a remote server through a web browser; ii) by hosting
the Web-OPAC on a remote server; iii) by setting up their own remote file storage
and database system (with scheduled backups).
1.10 KEYWORDS
Acquisition : The process of obtaining resources for the library’s
collection, typically including ordering, receiving and
payment.
API : Application Programming Interface. A language and
message format used by an application program to
communicate with the operating system or some other
control program such as a database management
system (DBMS).
Authority record : A record that shows the preferred form of a personal
or corporate name, geographic region or subject. It also
includes variant forms of the preferred form as cross
references.
Barcode : A printed code, consisting of lines and spaces that can
be read by a bar code scanner (reader), affixed to
physical materials in a library collection to identify
particular items for tracking and circulation.
Bibliographic identifier: A unique identifier which unambiguously identifies a
bibliographic record within an ILS catalog and is
assumed to be persistent, at least as long as the records
are managed within the ILS.
Bibliographic metadata: Information about a resource that serves the purpose
of discovery, identification and selection of the
resource. Includes elements such as title, author,
subjects, etc.
Discovery application: A computer application designed to simplify, assist
and expedite the process of finding information
resources.
Dublin Core : A fifteen element metadata set for use in resource
description intended to facilitate discovery of
electronic resources.
EDI : Electronic Data Interchange (EDI) is a standard method
for exchanging structured data, such as purchase orders
and invoices, between computers to enable automated
transactions.
EDIFACT : Electronic Data Interchange For Administration, Commerce and Transport.
Format integration: The concept of utilising a single set of specifications
for bibliographic records regardless of the type of
material they represent.
ERMS : Electronic Resources Management System is used to
manage a library’s electronic resources, primarily e-
journals and databases. Systems can include features
to track trials, license terms and conditions, usage, cost,
and access.
FRBR : Functional Requirements for Bibliographic Records is
a conceptual model for the aggregation and display of
bibliographic records. FRBR is an entity-relationship
model, with four primary entities - work, expression,
manifestation, and item - which represent the products
of intellectual or artistic endeavor.
ILL : Inter Library Loan (ILL) is the process between two
libraries of borrowing and lending a physical
bibliographic item, or obtaining a copy of it.
ILS : An automated library system that utilises shared data
and files to provide interoperability of multiple library
functions, e.g. cataloging, acquisition, circulation,
serials, etc.
Interoperability : The ability of two different computer systems to
communicate and exchange information in a useful
and meaningful manner.
LAN : A digital communication system capable of
interconnecting a large number of computers, terminals
and other peripheral devices within a limited
geographical area.
Library Automation: Library automation is the mechanisation of
housekeeping operations and information handling
mainly by using computer and communication
technologies.
MARC 21 : A harmonised MARC format developed by LoC in
1999 for encoding standards related to bibliographic
data, authority data, holdings data, classification data
and community information. It is used for the
communication and exchange of bibliographic
information (mentioned earlier) between computer
systems.
MARCXML : A metadata scheme for working with MARC data in a
XML environment.
Metadata : Structured information that describes an information
resource. “Data about data” for an information bearing
object for purposes of description, administration, legal
requirements, technical functionality, use and usage,
and preservation.
Metadata harvesting: A technique for extraction of metadata from individual
repositories for collection into a central catalog.
Module of ILS : Functions specific to a particular system capability
such as the online public access catalog, cataloging,
acquisitions, serials, circulation, etc.
NCIP : NISO Circulation Interchange Protocol (NCIP) is a
standard which defines a protocol for the exchange of
messages between and among computer-based
application to enable them to perform functions
necessary to lend and borrow items, to provide
controlled access to electronic resources, and to facilitate
co-operative management of these functions.
Network : A group of computers and other devices connected
together so that they can communicate with each other,
share data and resources such as printers, and perhaps
share the workload of running complex programs.
They may have one or more central servers to
coordinate and run things, or all devices may be of
equal standing (called “peer-to-peer”). The
connections between them may be physical wires and
cables, or wireless using infrared or radio frequency.
OAI-PMH : Open Archives Initiative Protocol for Metadata Harvesting. An
application-independent interoperability framework
based on metadata harvesting and the open standards HTTP
(Hypertext Transfer Protocol) and XML (Extensible
Markup Language).
OPAC : On-line Public Access Catalog is a library catalog
which can be searched on-line and is a module of the
ILS. It is the interface between library resources and
users and is designed to be “user friendly.”
Open Source : A concept through which programming code is made
available through a license that supports the users
freely copying the code, making changes to it, and sharing
the results. Changes are typically submitted to a group
managing the open source product for possible
incorporation into the official version. Development
and support is handled cooperatively by a group of
distributed programmers, usually on a volunteer basis.
OpenSearch : A collection of technologies developed by Amazon
that allow publishing of search results in a format
suitable for syndication and aggregation.
OpenURL : A URL with stored metadata that is user context
sensitive in what information or hypertext link is
delivered.
Protocol : A standard procedure for the message formats and rules
that two computer systems must follow to
communicate with each other.
RSS : Really Simple Syndication is an XML format used for
distribution or syndication of frequently updated Web
contents.
SIP2 : Standard Interchange Protocol Version 2 is a standard for
the exchange of circulation data and transactions
between different systems.
SRU : Search/Retrieve via URL is a standard search protocol
for Internet search queries, utilising CQL (Common
Query Language), a standard query syntax for
representing queries.
SRW : Search/Retrieve Web service is web services
implementation of the Z39.50 protocol that specifies
a client/server-based protocol for searching and
retrieving information from remote databases.
System Analysis : A powerful technique for the analysis of an
organisation and its work.
Unicode : A universal character-encoding standard used for
representation of text for computer processing.
Unicode provides a unique numeric code (a code point)
for every character, no matter what the platform, no
matter what the program, no matter what the language.
The standard is maintained by the Unicode
Consortium, which was incorporated in 1991.
WAN : A computer networking system that operates
nationwide or worldwide by utilising telephone line,
microwave and satellite links. It is also used to
interconnect LANs.
Web Service : Software system designed to support interoperable
machine to machine exchange of data/information,
typically using the XML, SOAP, WSDL and UDDI
open standards.
XML : eXtensible Markup Language is an open standard for
describing data from the World Wide Web Consortium.
It is used for defining data elements on a Web page,
business-to-business documents, and other
hierarchically structured text and data.
Z39.50 : A NISO and ISO standard protocol that specifies a
client/server-based protocol for cross-system searching
and retrieving information from remote databases. It
specifies procedures and structures for a client system
to search a database provided by a server.
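Two of the protocols defined above, OAI-PMH and SRU, are carried as ordinary HTTP requests, so a request can be sketched simply by building a URL. The sketch below is illustrative only: the base URLs under `example.org` are hypothetical and no request is actually sent. The parameter names (`verb`, `metadataPrefix`, `set`, `operation`, `version`, `query`, `maximumRecords`) are those defined by the OAI-PMH and SRU 1.2 specifications.

```python
from urllib.parse import urlencode


def oai_pmh_list_records(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build an OAI-PMH ListRecords request URL for harvesting metadata."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec  # optional: restrict harvesting to one set
    return f"{base_url}?{urlencode(params)}"


def sru_search(base_url, cql_query, maximum_records=10):
    """Build an SRU searchRetrieve request URL carrying a CQL query."""
    params = {
        "operation": "searchRetrieve",
        "version": "1.2",
        "query": cql_query,               # the CQL query is URL-encoded below
        "maximumRecords": maximum_records,
    }
    return f"{base_url}?{urlencode(params)}"
```

For instance, `oai_pmh_list_records("http://example.org/oai")` produces a request for all records in simple Dublin Core (`oai_dc`), the metadata format every OAI-PMH repository is required to support.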
UNIT 2 LIBRARY AUTOMATION
PROCESSES
Structure
2.0 Objectives
2.1 Introduction
2.2 Library Workflow: System Approach
2.2.1 Subsystems and Workflows
2.2.2 Analysis of Tasks
2.2.3 Automation of Workflow
2.3 Acquisition Subsystem in ILS
2.3.1 Functional Requirements for Acquisition in ILS
2.3.2 Workflow of Automated Acquisition
2.3.3 Products and Advantages
2.4 Document Processing Subsystem in ILS
2.4.1 Functional Requirements for Document Processing in ILS
2.4.2 Workflow of Automated Document Processing
2.4.3 Products and Advantages
2.5 Serials Control Subsystem in ILS
2.5.1 Functional Requirements for Serials Control in ILS
2.5.2 Workflow of Automated Serials Control
2.5.3 Products and Advantages
2.6 Circulation Subsystem in ILS
2.6.1 Functional Requirements for Circulation in ILS
2.6.2 Workflow of Automated Circulation
2.6.3 Products and Advantages
2.7 System Administration
2.8 Summary
2.9 Answers to Self Check Exercises
2.10 Keywords
2.11 References and Further Reading
2.0 OBJECTIVES
After going through this Unit, you will be able to:
• understand typical workflows of library subsystems amenable for automation;
• know how to analyse housekeeping operations systematically;
• identify the requirements, processes and advantages of automating library
workflow; and
• realise issues related to administration of library automation processes.
2.1 INTRODUCTION
You already know the what and why of library automation from Unit 1. This Unit
aims to introduce you to the processes related to library automation in an
integrated environment. You can also see here the application of the procedural model
of library automation in analysing tasks related to different subsystems of a library.
One of the major objectives of library automation is to automate the regular
workflow of the library system, i.e. library housekeeping operations. An ILS performs
library housekeeping operations through seamlessly integrated software modules.
These modules are also called subsystems under ILS. A typical ILS includes an
acquisition subsystem, a document processing subsystem, a serials control subsystem
and a circulation subsystem as core modules. Other managerial activities like
export/import, backup/restoration, parameter setting, configuration settings etc.
are performed through the administrative module.
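One way to picture modules that are "integrated seamlessly" is a single database shared by the catalogue, patron and circulation data, where a change in one table automatically carries through to related tables. The following is a minimal sketch using Python's built-in `sqlite3` module. The table and column names are assumptions made for illustration and do not reflect the schema of any particular ILS.

```python
import sqlite3


def build_ils_database():
    """Create an in-memory database shared by catalogue, patron and loan data."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces links only when enabled
    conn.executescript(
        """
        CREATE TABLE books   (book_id INTEGER PRIMARY KEY, title TEXT NOT NULL);
        CREATE TABLE patrons (patron_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE loans (
            loan_id   INTEGER PRIMARY KEY,
            book_id   INTEGER NOT NULL REFERENCES books(book_id) ON DELETE CASCADE,
            patron_id INTEGER NOT NULL REFERENCES patrons(patron_id) ON DELETE CASCADE
        );
        """
    )
    return conn


def loans_report(conn):
    """Circulation view: combine patron and book data through the loans table."""
    return conn.execute(
        "SELECT p.name, b.title FROM loans l "
        "JOIN books b ON b.book_id = l.book_id "
        "JOIN patrons p ON p.patron_id = l.patron_id "
        "ORDER BY l.loan_id"
    ).fetchall()
```

In this sketch the circulation report never stores a title or a patron name of its own. It reads them from the shared tables, and deleting a book removes its loan rows as well.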
Order
This procedure starts with pre-order searching, especially to avoid duplicate
orders. In the next stage purchase orders are generated and placed either
directly to the respective publishers or to the listed vendors/book sellers.
Additionally, generation of reminders for overdue items and cancellation of
orders also comes under the purview of ordering procedure.
Receive
Documents and invoices or bills usually arrive together. Bills are checked
with the order list before processing for payment. Newly arrived books are
tallied with the bills and the order list to check the author, title, edition,
imprints and price before accessioning.
Accession
A stock register is maintained by libraries in which all the documents
purchased or received in exchange or as gifts are entered. Each document is
given a consecutive serial number. The register is called the Accession
register and the serial number of the document is referred to as its Accession
Number.
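The order and accession procedures described above can be sketched as a small register. This is a simplified illustration under stated assumptions: the class and method names are hypothetical, titles are identified by ISBN alone, and multiple-copy orders are ignored.

```python
from itertools import count


class AcquisitionRegister:
    """Toy register covering pre-order search and accessioning."""

    def __init__(self, first_accession_number=1):
        self._next_number = count(first_accession_number)
        self.on_order = set()     # ISBNs with an open purchase order
        self.accessioned = {}     # accession number -> ISBN

    def is_duplicate(self, isbn):
        """Pre-order search: is the title already held or on order?"""
        return isbn in self.on_order or isbn in self.accessioned.values()

    def place_order(self, isbn):
        """Record an order unless the pre-order search finds a duplicate."""
        if self.is_duplicate(isbn):
            return False
        self.on_order.add(isbn)
        return True

    def accession(self, isbn):
        """Give a received document the next consecutive accession number."""
        number = next(self._next_number)
        self.on_order.discard(isbn)
        self.accessioned[number] = isbn
        return number
```

A second order for a title that is already on order or already accessioned is refused, which mirrors the purpose of pre-order searching.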
B) Processing Subsystem
The processing procedure is the pivot around which all the housekeeping
operations in a library revolve. It helps in the transformation of a library
collection into serviceable resources. The procedures under this subdivision
are classification, cataloguing, labelling and shelving.
Catalogue
Cataloguing is the prime method of providing access to the collection.
Cataloguing procedure starts with technical reading of the document to be
catalogued by studying title, sub-title, alternate title, author, editor, edition,
reprint, imprint, dedication, preface, table of contents, collation, series,
bibliographies etc. In case of manual cataloguing, the cataloguer makes
separate cards for author, title, subject, cross-references and analytical entries
by following any standard catalogue code (such as AACR II, CCC etc.) and
files them as per the rules laid down by the library. Computerised cataloguing
begins with entering bibliographical data in a pre-designed worksheet. The
worksheet or data sheet is very similar to data entry form and is based on
any standard content designators scheme (such as MARC 21 Bibliographic
Format, CCF/B, UNIMARC etc.). Finally bibliographical data recorded in
the worksheets are entered into the computer to produce machine-readable
catalogue file and OPAC. Computer-based cataloguing supports importing
of bibliographical datasets for the library resources either from centralised
cataloguing services or from other libraries and exporting of bibliographical
data of its own collection to other library systems. This facility reduces unit
cost of cataloguing and ensures standardisation in cataloguing. The recent
trend of cataloguing is to utilise Z39.50 protocol to download bibliographical
data from other libraries and to provide global access to its own collection
through Web-OPAC.
Label
Shelve
Shelving is the arrangement of documents on the shelves to fulfil the fourth
law of library science – Save the time of the reader. Generally books are arranged
on the shelves in a classified manner as per the call number. Bound
periodicals are generally shelved alphabetically by title and then by volume
numbers. Although shelving works are generally manual in nature, RFID-
enabled ILS helps in identifying misplaced documents in shelves and thereby
supports stock rectification.
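The idea of spotting misplaced documents from a shelf reading can be sketched as follows. This is a simplification under stated assumptions: call numbers are taken to be a decimal class number followed by a book mark (for example `025.04 SIN`), the shelf reading is assumed to arrive as a left-to-right list (for instance from an RFID reader), and the function names are hypothetical.

```python
from decimal import Decimal


def shelf_key(call_number):
    """Split a call number such as '025.04 SIN' into (class number, book mark)."""
    class_number, _, book_mark = call_number.partition(" ")
    return (Decimal(class_number), book_mark)


def out_of_sequence(shelf_reading):
    """Return call numbers that sort before the item shelved to their left."""
    flagged = []
    for left, right in zip(shelf_reading, shelf_reading[1:]):
        if shelf_key(right) < shelf_key(left):
            flagged.append(right)
    return flagged
```

For the reading `["020 KUM", "025.04 SIN", "025.3 HAR", "025.04 ZAF", "027.7 RAO"]`, only `025.04 ZAF` is flagged, because it sits to the right of `025.3 HAR` although it files before it.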
C) Circulation Subsystem
Circulation service is quite common to libraries of different types. Most
libraries lend books and other library materials to be read elsewhere by
users. This is convenient for the users, increases the use made of libraries’
collection and reduces demand for reading space within library building.
This function requires some sort of record keeping arrangement of what has
been lent and to whom. There are two good reasons for keeping loan records:
i) to reduce the loss of library materials; and ii) to help library staff to answer
users’ queries about the location of items not on the shelves.
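The record keeping described above, what has been lent and to whom, can be sketched with a small in-memory register. The class name, the fourteen-day loan period and the identifiers are assumptions for illustration only. A real circulation module would store this data in the shared ILS database.

```python
from datetime import date, timedelta


class LoanRegister:
    """Minimal loan record: which item is with which patron, and until when."""

    def __init__(self, loan_days=14):
        self.loan_days = loan_days
        self.loans = {}  # accession number -> (patron id, due date)

    def issue(self, accession_no, patron_id, on):
        if accession_no in self.loans:
            raise ValueError(f"{accession_no} is already on loan")
        self.loans[accession_no] = (patron_id, on + timedelta(days=self.loan_days))

    def return_item(self, accession_no):
        self.loans.pop(accession_no, None)

    def locate(self, accession_no):
        """Answer a user's query about an item that is not on the shelf."""
        if accession_no not in self.loans:
            return "not on loan"
        patron_id, due = self.loans[accession_no]
        return f"on loan to {patron_id}, due {due.isoformat()}"

    def overdue(self, today):
        """List accession numbers whose due date has passed."""
        return sorted(a for a, (_, due) in self.loans.items() if due < today)
```

The `locate` method serves the second purpose named above (answering location queries), while `overdue` supports the first (reducing loss of materials).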
Where from? | Bibliographies, Index, Requisition, Suggestions | Competent Authority | Book Selection Tools, MIS | Order form/Order letter | Order File/Computer Database
When? | After Select Procedure | Before Activation | After Authorisation | After Activation | After Activation
Who? | Library Asst./Technical Asst. | Librarian/Section-In-Charge | Library Asst./Technical Asst. | Library Asst./Library clerk | Library Asst.
How? | Receiving copy of Bibliographies, Suggestion slip | Enter Signature | Enter data/information on Order form/Computer Database and Generate Order | Filing the Copy of Order form/Saving in Computer | Deletion from Database/Removal from File
The analysis of tasks to perform activities within procedures may be done through
a set of five primary questions: What information is needed for the activity?
Where is the information obtained? When is it required? Who requires it? How
is it used? These five questions should be asked to carry out possible activities
under each procedure (see Table 2.2). It provides depth to the framework provided
by the procedural model. An example of this approach may be shown (in Table
2.2) in the context of five possible activities of book order procedure in acquisition
subsystem.
The ILS also provides a discovery interface (commonly known as the Online
Public Access Catalog or “OPAC”) that enables patrons to search for resources.
The OPAC includes simple and advanced search interfaces with support for member
login (to check reading history, borrowed books, fines, suggestions etc.). Most
ILSs now provide a Web-OPAC (accessible through a web browser), and these
are now compatible with social networking tools (such as Facebook, Twitter etc.)
and information mashups that integrate external datasets (like book cover images,
book reviews etc.) with local library materials (see Fig. 2.2).
Fig. 2.2: End user interface in Koha with social networking tools
The next sections discuss three groups of activities related to acquisition. These
are – pre-acquisition work, acquisition work and generation of outputs.
• Budget division
Sometimes it is necessary to divide a budget head into several sub-heads
(e.g. a book procurement head may further be subdivided into reference
books and text books). This step allows a user to divide the budget into
sub-heads or even divide the budget sub-heads further.
• Acquisition Works
Actual acquisition work starts after completion of pre-acquisition works.
The flow of acquisition works for document procurement in computerised
libraries irrespective of type or size may be divided into four logically
related groups – 1) Document related work; 2) Order processing; 3)
Accessioning; and 4) Payment.
Group I tasks
Acquisition work starts with collection of information related to documents to
be procured. Library staff initiates acquisition with entering bibliographical
information and information about requesters from the suggestion slips and books
submitted by the suppliers on approval. Bibliographical data given by the
requesters in suggestion slips require to be verified by consulting book selection
tools. The online databases of virtual bookstores (like Amazon or BookFinder)
may also be utilised for checking bibliographical information of recently published
documents. Bibliographical details of documents received by libraries gratis
are also entered into the database. A library normally receives a large number
of suggestions and documents for ordering. Library staff shortlist these requests
depending on need, availability of fund etc. by clicking the appropriate option(s)
available in the package. Finally a report is generated for all the short-listed
suggestions and documents indicating number of copies required, budget code,
budget head and unit price of the items requested. The library committee approves
the list officially and on the basis of the final approval list library staff either
select or reject the short listed titles. Books on direct approval and gratis items
do not have to go through approval process from library committee or any such
authoritative body.
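The report generated for short-listed items (number of copies, budget head and unit price) lends itself to a short worked example. The titles, prices and budget heads below are invented sample data, and the function name is hypothetical. The sketch only shows how a committed amount per budget head could be totalled for the library committee.

```python
from collections import defaultdict
from decimal import Decimal

# Sample short-listed suggestions (illustrative data only).
SHORTLIST = [
    {"title": "Sample Reference Work", "copies": 2,
     "unit_price": Decimal("450.00"), "budget_head": "Reference books"},
    {"title": "Sample Text Book A", "copies": 5,
     "unit_price": Decimal("120.50"), "budget_head": "Text books"},
    {"title": "Sample Text Book B", "copies": 1,
     "unit_price": Decimal("300.00"), "budget_head": "Text books"},
]


def approval_report(shortlist):
    """Total the cost (copies x unit price) of short-listed items per budget head."""
    totals = defaultdict(Decimal)  # Decimal() starts each head at zero
    for item in shortlist:
        totals[item["budget_head"]] += item["copies"] * item["unit_price"]
    return dict(totals)
```

`Decimal` is used instead of floating point so that currency totals are exact.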
Group IV tasks
The work of this group starts with the processing of invoices submitted by the
suppliers along with the documents by entering necessary elements into the
database. Release of payment is the next step in which letters/reports containing
all the necessary administrative and financial details are generated against supplier
or order number or invoice number for requesting appropriate authority (generally
Finance Section) to release payment to the supplier. After release of payment,
the financial details of payment are entered and stored into the database.
2.4 DOCUMENT PROCESSING SUBSYSTEM IN ILS
In an automated document processing environment, resource description or
cataloguing is possibly the most important task of library automation work. It
requires standardisation and should be supported by carefully crafted decision
table(s). The cataloguing module of an ILS gives us the freedom to choose MARC
standards (UNIMARC and MARC 21) or non-MARC standards (like the Common
Communication Format or your own standard). However, the MARC 21
bibliographic format is now considered the global de facto standard. The MARC
21 family of standards (five coordinated standards: the
bibliographic, authority, community information,
holdings and classification formats) is now selected as the content designator
in most ILSs. There are two reasons for this. First, MARC 21 standards are
updated continuously, available through the Web, and emerging as open standards.
Secondly, they are becoming de facto global standards in the
domain of library automation as they are adopted by national libraries in
different parts of the world. The cataloguing module of an ILS should also be
supported by an array of internationally agreed standards and facilities such as
FRBR, FRAD, pickup lists, authorised value lists, standard lists, and export-import
through ISO-2709 or MARC-XML. This section discusses the automated
document processing subsystem under three major heads – 1) Functional
requirements, 2) Workflow, and 3) Advantages and products.
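To show what a content designator does, the sketch below maps a few worksheet labels to MARC 21 bibliographic tags and subfield codes (020 $a ISBN, 100 $a personal name, 245 $a title, 250 $a edition, 260 $a/$b/$c place, publisher and date). It is a deliberately simplified illustration: indicators, punctuation rules and repeatable fields are ignored, and the worksheet labels, sample values and function names are assumptions.

```python
# Worksheet label -> (MARC 21 tag, subfield code)
WORKSHEET_TO_MARC = {
    "isbn": ("020", "a"),
    "author": ("100", "a"),
    "title": ("245", "a"),
    "edition": ("250", "a"),
    "place": ("260", "a"),
    "publisher": ("260", "b"),
    "year": ("260", "c"),
}


def worksheet_to_fields(worksheet):
    """Group worksheet values into tagged fields, ordered by tag and subfield."""
    fields = {}
    for label, value in worksheet.items():
        tag, code = WORKSHEET_TO_MARC[label]
        fields.setdefault(tag, []).append((code, value))
    return {tag: sorted(subfields) for tag, subfields in sorted(fields.items())}


def render(fields):
    """Show each field on one line, e.g. '245 $a Sample Title'."""
    return [
        tag + " " + " ".join(f"${code} {value}" for code, value in subfields)
        for tag, subfields in fields.items()
    ]
```

Whatever order the cataloguer fills in the worksheet, the output is arranged by tag and subfield code, which is what makes the record exchangeable between systems.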
Fig. 2.5: MARC 21 authority data entry framework (name authority) in Koha ILS
Field | Description | Type of Support
Leader | 24 characters fixed-length field | Pickup list for character positions
005 | Date and time of latest transaction | Automated entry of date and time from system
006 | Books – (00-17) – Fixed-length field | Pickup list for character positions
007 | Text – (00-01) | Pickup list for character positions
008 | Fixed-length data elements | Pickup list for character positions
040 | Cataloguing Source | Pickup of library code (as per MARC)
041 | Language Code | Code list support (as per MARC)
Fig. 2.6: Support to manage Leader field (24 character positions) in Koha ILS
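The Leader and field 005 entries in the table above can be illustrated with a short sketch. It reads a few of the 24 Leader character positions defined by MARC 21 (00-04 record length, 05 record status, 06 type of record, 07 bibliographic level, 09 character coding scheme, 12-16 base address of data, 17 encoding level) and formats a timestamp in the `yyyymmddhhmmss.f` pattern used by field 005. The function names and the sample leader are assumptions for illustration. This is not how Koha itself implements the feature.

```python
from datetime import datetime


def parse_leader(leader):
    """Pick out selected character positions from a 24-character MARC 21 leader."""
    if len(leader) != 24:
        raise ValueError("A MARC 21 leader must be exactly 24 characters long")
    return {
        "record_length": int(leader[0:5]),
        "record_status": leader[5],
        "type_of_record": leader[6],
        "bibliographic_level": leader[7],
        "character_coding_scheme": leader[9],
        "base_address_of_data": int(leader[12:17]),
        "encoding_level": leader[17],
    }


def field_005(moment):
    """Format a timestamp as yyyymmddhhmmss.f for control field 005."""
    tenths = moment.microsecond // 100000
    return moment.strftime("%Y%m%d%H%M%S") + f".{tenths}"
```

For the sample leader `00714cam a2200205 a 4500`, position 06 is `a` (language material) and position 07 is `m` (monograph). These are the kinds of values an ILS offers through a pickup list.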
• OPAC should also support bulletin board, information desk and gateway
services (to access external databases) along with patron self-service options
(e.g. holds, renewals etc.); and
• OPAC must track users’ preferences and interests, organise them into a list of
favourites and support an interactive, participative and collaborative platform
through Web 2.0 tools like RSS, social networking tools, user tagging,
document rating etc.
Distributed cataloguing
• Must be a Z39.50 compliant cataloguing system [ANSI/NISO Z39.50 (1995)
or ISO 23950 (1998)];
• Should enable the capture of bibliographic and authority records from any
Z39.50 server through a Z39.50 client; and
• Should allow local manipulation (change of call number etc.) of captured
data.
2) Union catalogue
Library networks at the global level (like OCLC, RLIN) and national level
(like INFLIBNET and DELNET in India) provide union catalogues of
member libraries in machine readable form. Union files of the stock of
several libraries, or another shared database, may be imported, converted
into the local standard format and finally merged into the catalogue database.
3) Commercially available files of MARC records
In this process records from external databases may be added from tape, or
by downloading directly from the files through a network. A further option is
to acquire records on CD-ROM or DVD-ROM and to download records from
optical media. For example, Harvard University, US, recently uploaded all its
bibliographic records in MARC 21 format (2 million book records) for other
libraries.
4) Z39.50 server
Computerised cataloguing provides the unique advantage of loading and
merging bibliographic and authority records from external databases.
There are thousands of Z39.50 servers from which validated bibliographic
data may be selectively downloaded at the local level (see Fig. 7).
This feature of an automated system leads to a reduction in cataloguing
effort and a consequent saving in the unit cost of cataloguing. This mode of
shared cataloguing is popularly termed copy cataloguing and is implemented
in ILSs through the Z39.50 standard developed by ANSI/NISO.
7) Discuss the MARC 21 family of standards.
In the serials control module of an ILS, master databases play an important
role. Any number of additions, modifications and deletions are possible in the
master database, and these changes are automatically reflected in all the
sub-modules under that module. This reduces data entry work and ensures
standardisation. A typical serials control module includes:
Title master
In this file bibliographical details of new serials are entered (on the basis of
standard comprehensive data format like MARC 21 bibliographic format) after
the selection and approval process.
Country master
This file contains the names of countries and their corresponding codes for
entering country of publication data in sub-modules of serials control. The
country code is generally based on ISO-3166, where each country is represented
by two unique characters, e.g. the code for India is IN as per ISO-3166.
Language master
Now, in most cases, the MARC 21 geographic area code (GAC) is used for the
purpose. But this file may also contain entries for languages and their
three-digit codes as per the ISDS manual and the CCF manual.
Supplier/Publisher/Binder master
This master file contains details of all local and foreign subscription agents,
publishers of serials and binders along with their corresponding codes. These
codes are generally created locally.
The above mentioned master files are essential. The other important master
tables are: 1) Subject master (holds lists of subject descriptors); 2) Frequency
master (holds codes for serials frequencies); 3) Budget master (holds financial
data necessary for serials acquisition); 4) Currency master (contains currency
descriptions, codes and exchange rates for foreign currencies); 5) Delivery mode
master (contains different modes of delivery of serials by publishers and vendors);
6) Physical media master (holds forms, formats and media for serials in coded
form); 7) Binding type master (contains different modes of binding, e.g. standard,
leather, cloth and rexine binding, and their corresponding codes); 8)
Letter master (includes formats for every type of letter required for the generation
of outputs such as order letters, cancellation of order letters, reminder letters etc.).
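The way master files reduce data entry and enforce standardisation can be shown with a small sketch. It assumes that serial records store only codes and that descriptions are looked up in the master tables when a report is produced. All names, titles and table contents below are hypothetical; a real ILS keeps these tables in its database.

```python
# Hypothetical master tables: each maps a code to its description.
country_master = {"IN": "India", "GB": "United Kingdom"}
frequency_master = {"M": "Monthly", "Q": "Quarterly"}

# Serial records store only the codes, never the descriptions.
serials = []


def add_serial(title, country, frequency):
    """Add a serial after validating its codes against the master tables."""
    if country not in country_master:
        raise ValueError(f"unknown country code: {country}")
    if frequency not in frequency_master:
        raise ValueError(f"unknown frequency code: {frequency}")
    serials.append({"title": title, "country": country, "frequency": frequency})


def holdings_list():
    """Resolve the codes to descriptions at the time the report is produced."""
    return [
        f"{s['title']} ({country_master[s['country']]}, "
        f"{frequency_master[s['frequency']]})"
        for s in serials
    ]


add_serial("Journal A", "IN", "M")
add_serial("Journal B", "GB", "Q")
print(holdings_list())

# A single change in the master table shows up in every report line using it.
frequency_master["Q"] = "Quarterly (4 issues a year)"
print(holdings_list())
```

Because an unknown code is rejected at entry time, every record in the sub-modules is guaranteed to use the same controlled values.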
Altogether, there are 12 basic tasks in this group of works related to serials
control, given in sequence: 1) Selection of serials for new subscription; 2)
Renewal or discontinuation of existing journals/serials; 3) Selection of delivery
mode; 4) Selection of subscription mode; 5) Formulation of terms of procurement;
6) Selection of vendors; 7) Approval from authority; 8) Ordering and renewal;
9) Payment; 10) Receiving and registration; 11) Reminder generation; and 12)
Adjustment of advance payment for non-receipted issues.
Article indexing
The article indexing option is generally required by libraries in research
institutes. Indexing of articles (also called papers) from journal issues is an
optional facility of the serials control subsystem. Generally, publishers of
primary periodicals produce annual and other sorts of indexes regularly. Apart
from such products, libraries also subscribe to a number of indexing and
abstracting journals related to their areas of interest. As a result, article
indexing is only necessary when the available indexing and abstracting services
do not cover the core journals in the discipline of interest.
Leader 24 characters fixed-length field
00X group Control Fields
005 Date and time of latest transaction (NR)
006 Serials – (00-17) – Fixed-length field (R)
008 Fixed-length data elements – General information (NR)
Table 4: Data elements (minimum) for serials on the basis of MARC 21 bibliographic
format (R = Repeatable field and NR = Non-repeatable field)
Group IV: Circulation and Binding
This group includes following jobs –
Circulation
Circulation of serials is often referred to as routing of journals. The
circulation pattern of serials differs largely from that of books. But if
serials are available for ordinary loan, then the same circulation control
system will suffice as for monographs. However, serials are generally reserved
for reference use only. In special libraries, short-term loan options for
journals are common because of the specific needs of users. If the number of
transactions per day is large enough, then such a transaction system may be
computerised. Such a computerised facility must have a list of serials taken, a
list of users and their addresses, and a transaction interface with options for
the generation of required output.
Binding
Back volume management is an important job in serials control. A valuable
feature of computer based serials control subsystems is that they inform the
library staff of volumes that have been completed and are now ready for
binding. This helps in work scheduling and in spreading the binding load evenly
throughout the year. After a back volume of a journal is bound, the bound
volume is accessioned and the holding information for the concerned journal is
then modified in the bibliographic database of journals.
2.5.3 Products and Advantages
The outputs or products of an automated serials control subsystem may be grouped
into three basic categories: OPAC (gives search options for journals, journal
articles and journal holdings), reports and lists (provide status reports and MIS
reports for decision making) and information products (such as table-of-contents
and other alerting services including SDI). The OPAC of an ILS allows searching
serials by Title (Current title, Complete holdings, Key title, Linked title, Variant
title), Subject (Broad subject heading, Subject divisions, descriptors and class
number), Publisher, Title history (Title split, Title merge, Title change, Title
holdings), ISSN and Free text. Several reports, letters and statistics can be
generated by the automated serials control system, such as List of suggestions,
List of approved titles, List of titles ordered, List of issues received, List of
non-receipted issues, List of missing issues etc. In the serials control module of
an ILS, information products originate either from article indexing activities or
from the serials catalogue database and are produced on demand, such as List of
recent arrivals for issues of a group of journals (as selected by users), List of
journals available in a particular discipline, Discipline-wise holding list of
serials, Table of contents service for a group of journals (as per user selection),
Compilation of on-demand subject bibliographies, CAS and SDI services in online and
offline mode etc.
9) What is a predictive mode of serials control? Discuss its advantages in library
automation.
The other broad groups of activities for the workflow of automated circulation
are:
Membership Management
This sub-module is basically meant to create and update membership records in a
library. The works of this sub-module are: 1) Master database creation and
maintenance facility; 2) Member category and privileges management; 3) Institute
profile and profiles of Departments/Divisions under the institute; 4) Calendar to
record weekdays and closed days for the library; 5) Member enrolment facility
including modification/deletion/renewal of membership; 6) Output generation
facility.
Transaction Management
The transaction sub-module includes all the day-to-day activities of the
circulation section of a library, viz. issue, return, renewal, reservation,
reminders for overdue books, searching document availability and listing of items
issued to a member.
Reminder Generation
This facility is meant for generating reminders for overdue documents – To a
group of members, To individual members, For a particular due date, To all
members. The format and text of reminder letter may be modified by using this
facility or by using the master database.
Fiscal Management
It provides an option to manage outstanding dues against a member. It also
includes generation of payment receipts. A fine amount may be waived by
authorised staff. This facility should also allow printing of a fine statement
if a member wants to have a statement of fines.
Maintenance
Maintenance is generally attached with circulation module for recording
information about lost documents, documents sent for binding, damaged
documents, missing documents and documents withdrawn from library.
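The interplay between the library calendar (closed days), overdue detection and fiscal management described above can be sketched as follows. This is only an illustration under stated assumptions: fines accrue per day after the due date, closed days are not charged, and authorised staff may waive the fine. The function names, the rate and the dates are hypothetical.

```python
from datetime import date, timedelta


def chargeable_days(due, returned, closed_days=frozenset()):
    """Count days after `due` up to and including `returned`, skipping closed days."""
    days = 0
    current = due + timedelta(days=1)
    while current <= returned:
        if current not in closed_days:
            days += 1
        current += timedelta(days=1)
    return days


def fine_amount(due, returned, rate_per_day, closed_days=frozenset(), waived=False):
    """Return the fine for one loan; a waiver by authorised staff sets it to zero."""
    if waived:
        return 0
    return chargeable_days(due, returned, closed_days) * rate_per_day


# Illustrative values: the library is closed on 3 March 2024.
closed = {date(2024, 3, 3)}
due_date = date(2024, 3, 1)
return_date = date(2024, 3, 6)

print("Chargeable days:", chargeable_days(due_date, return_date, closed))
print("Fine:", fine_amount(due_date, return_date, rate_per_day=2, closed_days=closed))
print("Fine after waiver:", fine_amount(due_date, return_date, 2, closed, waived=True))
```

The same chargeable_days check, run against today's date for every open loan, gives the list of members who should receive an overdue reminder.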
11) Explain the use of RFID in automated circulation.
2.8 SUMMARY
This Unit starts with a theoretical discussion on system analysis and shows the
application of procedural model to analyse tasks related to housekeeping
operations under different sections of a library. It discusses library automation
processes in integrated setup under four major subsystems namely acquisition
subsystem, document processing subsystem, serials control subsystem and
circulation subsystem. Each subsystem includes three major heads of discussion
uniformly. The heads of discussion are functional requirements for the subsystem,
workflow of the subsystem and advantages of automating the subsystem including
typical products of the automated subsystem. Functional requirements section
argues what an ILS should support and workflow section discusses how an ILS
may be utilised for automating the subsystem. This unit ends with a discussion
on system administration jobs related to library automation.
4) The acquisition module of any ILS requires some essential works that need to
be done before proceeding with the actual acquisition work. These are termed
pre-acquisition works. This set of activities includes: creation of a master
file for vendors/publishers/suppliers, creation and maintenance of a currency
conversion table, budget allocations under different heads, setting of
pre-defined letters for ordering etc., member creation and privilege setting.
9) Predictive mode of serials control means the ability of the ILS to predict the
arrival of individual issues of a journal and to generate reminders
automatically for non-receipted issues or parts within a stated time
interval. An automated serials control subsystem may be predictive or
non-predictive. A predictive serials control system saves labour, energy, time
and money and ensures timely release of reminders for due issues
of journals.
10) Circulation work of a library involves a group of operations that are specific,
repetitive and systematic. As a result, automated circulation systems have
been fairly successful from the early days of library automation. Such systems
require a minimum set of essential data for carrying out circulation activities,
and the data may be captured in a variety of ways. In an academic library, where
users are generally large in number, this automated subsystem saves a great
deal of the users' time.
12) The administrator or super user should control the overall administration of
the ILS through a highly secured module for managing access control (for
individual users, for each module and for each function); system security to
prevent unauthorised access to databases; standards implementation and
setting of system parameters; and a log of each transaction that alters
the database. The other important jobs of system administration are privilege
control, branch management, backup and restoration, and system configuration.
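The predictive mode described in answer 9 can be sketched in a few lines. The sketch assumes a fixed interval in days between issues and a grace period before a claim is raised; real systems use richer publication patterns. The function names and all dates are hypothetical.

```python
from datetime import date, timedelta


def expected_arrivals(first_expected, interval_days, issues):
    """Predict the arrival date of each issue from a fixed interval."""
    return [first_expected + timedelta(days=interval_days * i) for i in range(issues)]


def issues_to_claim(expected, received, today, grace_days):
    """Return issue numbers overdue beyond the grace period and not yet received."""
    return [
        number
        for number, due in enumerate(expected, start=1)
        if number not in received and today > due + timedelta(days=grace_days)
    ]


# Illustrative journal: four issues, the first expected on 10 January 2024.
schedule = expected_arrivals(date(2024, 1, 10), interval_days=30, issues=4)
claims = issues_to_claim(schedule, received={1, 3}, today=date(2024, 3, 20), grace_days=14)

for number in claims:
    print(f"Reminder: issue {number} was expected on {schedule[number - 1]}")
```

A non-predictive system would record issues only as they arrive, so the gap left by the second issue would go unnoticed until a staff member checked the holdings manually.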
2.10 KEYWORDS
Backup : Storage of records in magnetic or optical media for
recovery of data at the time of need.
Barcode : A barcode is simply a computer readable tag that is
used to identify individual items and patrons that are
related to a specific library database.
Boolean Operators : The words AND, OR, and NOT used to combine
concepts or search terms when searching a database
for information.
Budget Allocation : It is the distribution of total library budget into various
budget heads and subheads.
Charging : It is the act of ‘issuing’ a document and to record the
loan transaction.
Check-in : The act of receiving and recording arrival of individual
parts of serials.
Common Communication Format (CCF) : The CCF was developed by the General
Information Programme (PGI) of UNESCO in order to facilitate
exchange of bibliographic data between organisations,
and first published in 1984. It is a highly compatible
format that provides a structure in which records may
be entered to the system; a format best suited to long-
term storage; a format to facilitate retrieval and a
format for display.
Data field : In a record, a meaningful collection of one or more
related characters treated as a unit. In bibliographic
records, these are variable length portion containing
a particular category of data.
Directory : A table of entries, each of which gives the tag, length,
and location within the record, segment identifier and
occurrence identifier of one data field.
Discharging : The act of cancelling the records of documents on loan
after their return.
Indicator digit : The first two characters of each data field, supplying
further information about the contents of the field.
Intranet : The network that uses Internet technologies (TCP/IP
and others) for local connectivity and is available only
to the members of the network.
ISDS : An acronym for International Serials Data System. An
international network of operational centers
(established in 1973 within the framework of UNISIST
programme), which are jointly responsible for the
creation and maintenance of computer-based
databank, and facilitates retrieval of scientific and
technical information in serials.
ISO-2709 : An international standard for bibliographic information
interchange on magnetic tape, developed in 1981.
Most of the content designator schemes constitute a
specific implementation of this standard.
ISSN : Acronym for International Standard Serial Number –
an internationally accepted code for the identification
of serials publications. It consists of seven Arabic
digits with an eighth that serves to verify the number
in computer processing.
Mandatory field : A data field, which should appear in the record when
the relevant information appears on the item.
MARC 21 : MARC 21 is a family of five coordinated formats namely
MARC 21 format for authority data, bibliographic data,
classification data, community information and
holdings data. MARC 21 is a development over
USMARC, and has become the de facto bibliographic
standard in the area of computerised cataloguing.
Merging of Title : The combining of two or more journals into a single
journal under one title.
Record : A collection of information, in one or more fields,
about an entity.
Repeatable field : A data field, which may appear more than once in the
same segment.
Repeatable sub-field : A subfield, which may appear more than once in a
single occurrence of the data field to which it belongs.
Reservation : A request for a specific book or other circulating items
to be reserved for a member as soon as it becomes
available on completion of processing, or on its return
from the binder or another member.
Routing : The systematic circulation of periodicals or other
printed material among the staff or members of a
library in accordance with their interests in order to
keep them informed of new developments.
SDI : Abbreviation for Selective Dissemination of Information
Systems. It is an automated system of information
retrieval utilising a computer for disseminating relevant
information to users. An interest profile depicting and
defining each area of interest is compiled for each user;
it consists of terms, which are likely to appear in
relevant documents.
Splitting of Title : The breaking of a single journal into two or more
different journal titles.
Standing Order : An order to supply each succeeding issue of a serial
publication or subsequent volumes of a work published
in a number of volumes issued intermittently.
Sub-field : A separately identified part of a data field containing
a data element.
Sub-field identifier : Two characters immediately preceding and identifying
a subfield. First character is called subfield flag and
the second character is termed as subfield code.
System Analysis : A powerful technique for the analysis of an
organisation and its work.
Tag : A three-character code appearing in the directory,
associated with a data field and used to identify it.
Union Catalogue : A catalogue of the various departments of a library, or
a number of libraries, indicating their locations. Union
catalogue of serials includes the complete holding of
serials available in member libraries.
Withdrawal : The process of cancelling records in respect of
documents that have been withdrawn from the stock
of a library.
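The verifying role of the eighth character of an ISSN, mentioned in the keyword list above, can be shown with a short sketch. It uses the standard modulus 11 method: the first seven digits are weighted 8 down to 2, and a check value of 10 is written as X. The function names are illustrative only.

```python
def issn_check_digit(first_seven):
    """Compute the eighth (check) character from the first seven digits."""
    if len(first_seven) != 7 or not (first_seven.isascii() and first_seven.isdigit()):
        raise ValueError("exactly seven digits are required")
    # Weights run 8, 7, 6, 5, 4, 3, 2 across the seven digits.
    total = sum(int(digit) * weight for digit, weight in zip(first_seven, range(8, 1, -1)))
    check = (11 - total % 11) % 11
    return "X" if check == 10 else str(check)


def is_valid_issn(issn):
    """Validate an ISSN written with or without the hyphen."""
    compact = issn.replace("-", "").upper()
    if len(compact) != 8:
        return False
    body, check = compact[:7], compact[7]
    if not (body.isascii() and body.isdigit()):
        return False
    return issn_check_digit(body) == check


for candidate in ("0378-5955", "0378-5954", "1050-124X"):
    print(candidate, "valid" if is_valid_issn(candidate) else "invalid")
```

A serials control module can run such a check at data entry time, so that a mistyped ISSN is caught before it reaches the title master.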
UNIT 3 LIBRARY AUTOMATION – SOFTWARE PACKAGES
Structure
3.0 Objectives
3.1 Introduction
3.2 History, Evolution and Generations
3.2.1 Historical Foundation
3.2.2 Evolution
3.2.3 Generation of Packages
3.3 Categorisation of ILS
3.3.1 Categorisation by Distribution Policy
3.3.2 Categorisation by Place of Origin
3.4 Open Source Software Packages
3.4.1 Evergreen
3.4.2 Koha
3.4.3 NewGenLib
3.4.4 PMB
3.5 Commercial Software Packages
3.5.1 LibSys
3.5.2 SLIM
3.5.3 SOUL
3.5.4 Virtua ILS
3.6 Freeware ILS
3.6.1 ABCD
3.6.2 E-Granthalaya
3.6.3 WEBLIS
3.7 Evaluation of Software Packages
3.7.1 Generic Parameters of Evaluation
3.7.2 Specific Parameters of Evaluation for Commercial ILSs
3.7.3 Specific Parameters of Evaluation for Freeware and Open Source ILSs
3.8 Global Recommendations
3.9 Summary
3.10 Answers to Self Check Exercises
3.11 Keywords
3.12 References and Further Reading
3.0 OBJECTIVES
After going through this Unit, you will be able to:
• understand historical background, evolution and generation of library
automation software packages;
• categorise library automation software as per origin and distribution policies;
• identify features and specialties of major commercial and open source
software packages in the domain of library automation; and
• know the processes for evaluating library automation packages and
understand the trends in developing library automation software packages.
3.1 INTRODUCTION
In this Unit we are going to study the library automation packages. We have
already covered different aspects of library automation in Unit 1 and processes
and workflows of library systems in Unit 2. This Unit aims to introduce you to
the applications of library automation software for different workflows in a library
system and its roles in providing information services to users and MIS services
to library staff. Mukhopadhyay (2006) outlined the role of typical library
automation software for two major subsystems of a library – operational
subsystem and administrative subsystem (see Fig. 3.1).
3.2.2 Evolution
You already know from Unit 1 that the library automation process
underwent five eras on the basis of technological improvements in computer
programming, database management systems, network capabilities and web
integration. To respond to these changes, library automation software also
improved considerably through five different generations. Mukhopadhyay (2006)
reported a comparative account of four generations of ILSs. The use of cloud
computing, web-scale management, linked open data and Web 2.0 technologies
initiated the fifth generation of ILSs. This section points out the major
technological features of the five generations of ILSs, and the next section
(3.2.3) gives a comparative account of the five generations of library
automation software against the features earmarked by Mukhopadhyay (2006).
• The first generation ILS packages were piecemeal, non-integrated and non-
portable across hardware architectures and software platforms. These
packages were module-based systems with little or no integration
between modules. Circulation and cataloguing modules were the
priority for these systems, which were developed to run on specific
hardware platforms and proprietary operating systems;
• The most important achievement of the second generation of packages was
hardware and platform independence. The second generation ILSs became
portable between various platforms with the introduction of UNIX and DOS
based systems. The ILSs of this generation offered links between systems for
specific functions and were command driven or menu driven systems;
• The most important features of the third generation of packages were GUI,
seamless integration of modules and relational model based client-server
architecture. The third generation ILS packages were fully integrated systems
based upon relational database structures and client-server architecture. They
embodied a range of standards, which were a significant step towards open
system interconnection. Colour and GUI features, such as windows, icons,
menus and direct manipulation, became standards and norms in this
generation;
• Web architecture, Unicode and digital media archiving were the major
attributes of the fourth generation ILSs. The fourth generation ILSs were
based on web-centric architecture and facilitated access to other servers over
the Internet. These systems were Unicode compliant and allowed access to
multiple sources from one multimedia graphical user interface; and
• The present, fifth generation ILSs are rapidly adopting cutting edge
technologies like web-scale management, cloud computing, Web 2.0 features
based on AJAX (Asynchronous JavaScript and XML) technology,
Application Programming Interfaces (APIs), and linked open data. The rise of
open source ILSs and the implementation of open standards are also remarkable
features of this generation.
No.  Feature            1st generation        2nd generation         3rd generation         4th generation                             5th generation
13   Interface          Command driven (CUI)  Menu driven (CUI)      Icon driven (GUI)      Icon driven with Web and Multimedia (GUI)  Web 2.0-enabled interfaces
18   Distribution mode  Closed and in-house   Closed and proprietary Closed and proprietary Both closed and open source                Mainly open source
3) Enumerate features of 5th generation ILS.
Please remember that the examples are only illustrative, not comprehensive. There
are several ILSs in use in Indian libraries from both the commercial and open source
domains. In the closed source group, LibSys and SOUL are the dominating ILSs,
and in the open source group Koha and NewGenLib are the most popular ILSs.
Some libraries in India are using WEBLIS, which is based on CDS/ISIS. It has
already been mentioned that the availability of open source ILSs helped in large-
scale library automation in India as far as school libraries, college libraries and
public libraries are concerned. To date, around fifteen open source ILSs are
available for use. However, we may categorise open source ILSs as per their
maturity level in terms of architecture, data model, core modules, support for
standards, multilingual data processing ability, user services and interoperability.
The Kuali ILS is an experimental open source library automation software, as it
is trying to implement the OLE and ILS-DI recommendations for developing the
next generation automated library system.
3.3.2 Categorisation by Place of Origin
Mukhopadhyay (2001, 2005) grouped ILSs available in India on the basis of
place of origin. This grouping later on was adopted by many researchers in the
field. It includes three fundamental categories – ILSs of foreign origin, ILSs
developed over ILSs (or textual database management systems) of foreign origin
and ILSs of Indian origin. This grouping may again be sharpened by dividing the
packages on the basis of size of library systems i.e. large library system, medium
range library system and small range library system.
5) Categorise ILSs available in India with examples.
The research study of Müller (2011) identified three mature open source ILSs,
namely Evergreen, Koha and PMB. We are going to study these three open source
ILSs along with NewGenLib as a special case, as it originated in India.
3.4.1 Evergreen
Evergreen (http://evergreen-ils.org/), like Koha (released in 2000 as an open
source ILS), originated in the public library domain. The Evergreen Project
was started in 2006 by the Georgia Public Library System to support 275 public
libraries in the state of Georgia, US. This client-server open source ILS is based
on a robust, scalable, message-passing framework called OpenSRF. It is available
under GNU GPL, version 2, and is currently used by over 1000 libraries around the
world.
It has modules for circulation (with sophisticated fiscal management), cataloguing
(with a comprehensive MARC 21 based catalogue editor), Web catalogue,
statistical reporting, acquisition and serials control. It also supports the SIP2
protocol for self-check. The current release is version 2.6 (released in April 2014)
and the next release (version 2.7) is due in September 2014. It has comprehensive
documentation (http://docs.evergreen-ils.org/), a wiki
(http://evergreen-ils.org/dokuwiki/doku.php), and a feature request facility.
System requirements
Evergreen is based on client-server architecture. It means that the server
version of Evergreen must be installed on the server, and the client version
of Evergreen needs to be installed and configured on the client machines. The
minimum hardware requirements of the server and client machines are as follows:
Server level
• A high-end desktop or entry-level server.
• 1GB RAM, or more (if server runs a graphical desktop).
• Architecture to run Unix-like Operating System (any flavour of Linux).
• Ports 80 and 443 should be open for TCP/IP connections to allow OPAC
and staff client connections to the Evergreen server.
• Network to establish server-client connections.
Client machines
• Low-end desktop with Windows (XP, Vista, or 7/8), Mac OS X, or Linux
operating system.
• A reliable high speed Internet connection.
• 512MB of RAM.
• TCP protocol to connect to the Evergreen server at ports 80 and 443.
• Barcode scanner and printer (optional).
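Before installing the staff client, it can be useful to confirm that a client machine can actually reach ports 80 and 443 on the server. The following is a minimal sketch using only the Python standard library; the function name port_is_open is an assumption made for illustration and is not part of Evergreen.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


def check_server(host, ports=(80, 443)):
    """Report the reachability of each required port on the given server."""
    return {port: port_is_open(host, port) for port in ports}
```

For example, check_server("evergreen.example.org") (a placeholder host name) returns a dictionary such as {80: True, 443: False}, which would point to a firewall rule blocking the secure port.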
Companion software
Apart from Evergreen server and client software, the server machine requires
following companion software to run server version of Evergreen:
4) Unix-like Operating System.
5) PostgreSQL as RDBMS (version 9 or later).
6) Apache as Web server (version 2.x).
7) OpenSRF (version 2.3.0 or later).
8) libdbi-libdbd libraries.
Major Features
The general features of evergreen ensure stability (even under extreme server
load), capability (robust handling of high volume of transactions and concurrent
users), flexibility (to accommodate the varied needs of libraries), security (to
protect our patrons’ privacy and data) and interactivity (to facilitate patron and
staff in using the system). Apart from these features, it supports all sorts of core
activities like:
• System administration (privilege control, user and group management,
cataloguing editor control, log records management, system parameters
settings, report generation, granular access control, search enhancing, Z39.50
server and client settings, module administration, SMS gateway management,
federated search control, EDI based acquisition control, theme and skin control
for fine tuning user interface, data migration, backup and restoration etc.);
• Acquisitions (acquisitions settings, cancel/suspend reasons, claiming,
currency types, distribution formulas, EDI (electronic data interchange),
exchange rates, fund tags, funding sources, funds management, invoice
menus, line item features [alerts appear in a pop-up box when the line item,
or any of its copies, are marked as received], providers [vendor/supplier
based profile that includes contact information for the provider, holdings
information, invoices, and other information.]);
• Cataloguing (comprehensive MARC editor, authority data control, model
data entry worksheet, authority lists support, multilingual data entry,
integration of external resources, authority control through MARC 21
authority format, thesaurus integration (eleven thesauri are
available and cataloguers can create new thesauri), creation of browsing
categories, record display control, link checker (helps to verify the validity
of URLs stored in MARC records), cross-linking of items (facility to link
items to multiple bibliographic records), distributed cataloguing through
Z39.50 client, bibliographic data export/import, bibliographic search
enhancements – supports for advanced search operators);
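The link checker mentioned in the cataloguing features can be pictured with a small example. The sketch below is only an illustration in Python and does not reproduce Evergreen's implementation: it performs a syntactic check, whereas a real link checker would also request each URL over the network. The record layout (a dictionary with an "856u" key, after MARC 21 field 856 subfield u, which holds the URI) is an assumption made for the example.

```python
from urllib.parse import urlsplit


def is_well_formed(url: str) -> bool:
    """Syntactic check only: an http(s) scheme and a non-empty host part."""
    try:
        parts = urlsplit(url.strip())
    except ValueError:
        return False
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def find_suspect_urls(records):
    """Return (record id, url) pairs whose URL fails the syntactic check.

    Each record is a hypothetical dict: {"id": ..., "856u": [url, ...]}.
    """
    suspects = []
    for record in records:
        for url in record.get("856u", []):
            if not is_well_formed(url):
                suspects.append((record["id"], url))
    return suspects
```

The list returned by find_suspect_urls can be handed to a cataloguer for correction of the affected records.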
Important URLs
• Downloading (http://evergreen-ils.org/egdownloads/);
• Documentation (http://evergreen-ils.org/eg-documentation/);
• Users list (http://evergreen-ils.org/dokuwiki/doku.php?id=evergreen_
libraries);
• Wiki (http://wiki.evergreen-ils.org/doku.php);
• Mailing list (http://evergreen-ils.org/communicate/mailing-lists/);
• Blog (http://evergreen-ils.org/communicate/blog/);
• IRC (http://evergreen-ils.org/communicate/irc/); and
• Book (http://en.flossmanuals.net/_booki/evergreen-in-action/evergreen-in-
action.pdf).
Remark
Evergreen open source ILS has improved a lot in recent years and is presently
considered the model ILS for managing library consortia and library networks.
However, the above-mentioned features of Evergreen suggest that the ILS can be
deployed in any type or size of individual library to support core automation
workflow as well as many value-added features.
3.4.2 Koha
As you already know, there are now almost fourteen open source ILS in the
domain of library automation. Koha is the first open source ILS (released in
2000 as open source) and is possibly now the most feature rich open source
ILS. Koha changed the rules of the game in the ILS market and set trends in many
ongoing changes in the area of library automation. Koha originated in the public
library system of New Zealand. In the Maori language, Koha means an unconditional
gift. The first version (1.0) of Koha was made available for downloading as open
source software in July 2000. The current stable version is 3.14.06 (released on
April 30, 2014). The Koha ILS community is very active and every month the
developer community provides a bugfix release. Koha versions with new features
are released every six months (for example, the next stable version 3.16 is
expected to be released in June 2014). Koha is an integrated library management
system that was originally developed by Katipo Communications Limited of
Wellington, New Zealand for the Horowhenua Library Trust (HLT), a regional
library system located in Levin near Wellington. In 1999, Katipo proposed
developing a new system for HLT using open source tools (PERL, MySQL, and
Apache) that would run under Linux and use Telnet to communicate with the
branches. The software was in production on 3rd January 2000, and released
under the GPL for other people to use in July 2000. Koha 1.01 was released on
August 9, 2000. Koha is essentially based on LAMP architecture. Here L is
Unix-like OS (different flavours of Linux); A is Apache Web server; M is MySQL
RDBMS and P is PERL programming environment. Koha is a pioneer in a number
of technological achievements such as use of Web 2.0 tools, integration of
authority format and bibliographic data format, availability of OPAC interface
in 25 different languages, implementation of Z39.50 server and OAI/PMH
compatibility, built-in support for social networking tools, independent branch
management, Web-based self issue, use of open standards for different modules
and granular system administration facilities.
System requirements
Koha is based on Web architecture. Both staff interface for professional activities
and public access interface for retrieval are available through Web browser. This
Web-enabled open source ILS supports 24×7 access for both staff
and users. Another important advantage of the Web architecture is that no client
software needs to be installed on end-user terminals. A Web browser (like
Firefox, Chrome etc.) acts as the client software at the end-user terminal. This feature
of Koha reduces maintenance work to a great extent in a large campus library
(for example, we need to install, configure and maintain Koha only at the
server; at client level no Koha-specific maintenance is required, as client machines
access Koha through a preloaded Web browser). In short, Koha is installed at the
server, and client machines access the Koha server through a Web browser
(most desktops and laptops come preloaded with one). The minimum
hardware requirements of server and client machines are as follows:
Server level
• A high-end desktop or entry-level server
• 1GB RAM, or more (if server runs a graphical desktop)
• Architecture to run Unix-like Operating System (any flavour of Linux but
Debian and its derivatives like Ubuntu are mostly in use)
• Ports 80 and 8080 should be open for TCP/IP connections to allow
OPAC and staff client connections to the Koha server. These two ports are
the default ports for the OPAC and staff interfaces respectively, but the ports can be
changed as per the network settings of the library
• Network to establish TCP/IP connections.
Client machines
• Low-end desktop with Windows (XP, Vista, or 7/8), Mac OS X, or Linux
operating system
• A reliable high speed Internet connection (optional)
• 512 MB of RAM
• TCP/IP protocol to connect Koha server at ports 80 and 8080 (or other ports
as desired)
• Barcode scanner and printer (optional).
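Because the OPAC and staff interfaces are reached through a Web browser on separate ports, the addresses given to users and staff follow directly from the host name and the chosen ports. A minimal sketch in Python, assuming plain HTTP and a hypothetical host name; the function is illustrative and is not a Koha utility.

```python
def interface_urls(host: str, opac_port: int = 80, staff_port: int = 8080) -> dict:
    """Build the OPAC and staff interface URLs for a Koha server.

    The defaults (80 and 8080) are the default ports named in the text.
    """

    def url(port: int) -> str:
        # Port 80 is the HTTP default, so it can be left out of the URL.
        suffix = "" if port == 80 else f":{port}"
        return f"http://{host}{suffix}/"

    return {"opac": url(opac_port), "staff": url(staff_port)}


# Example with a hypothetical host name:
# interface_urls("koha.example.org")
```

If the library changes the ports to suit its network settings, only the two port arguments need to change.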
Companion software
Apart from Koha, the server machine requires the following companion software to
run Koha:
1) Unix-like Operating System (Koha users prefer Debian, Ubuntu and CentOS)
2) MySQL as RDBMS (version 5.5 or later)
3) Apache as Web server (version 2.x)
4) YAZ toolkit
5) PERL programming environment (version 5.10 or later) and PERL modules
(version 3.14 of Koha requires a total of 139 PERL modules).
Major Features
Koha is considered the first and the best ILS from the open source domain. The
Koha developer team explored many emerging possibilities to redefine
the scope of ILS such as OAI/PMH server, Z39.50 server, OPAC in 25 languages
(the list is growing every day), options for two text retrieval engines (Zebra and
Apache-Solr), and options for two cataloguing interfaces (default cataloguing
template and Biblios template). However, the major features are as follows:
Special features
The Koha open source ILS has many special or unique features.
Some of the important special features are:
Enhanced features
• Can be integrated with free bibliographic data services (XISBN, Amazon,
ThingISBN)
• Full authority control
• Compliant fully with Unicode 5.1
• Can be used as CMS (Integration of ILS and CMS)
• Easy control of contents/news/running text
• Can easily be integrated with wiki, blogs etc.
• Supports emerging standards like NCIP, MARC-XML, DCMES, METS
• Supports sophisticated search features – Boolean, Relational and Positional
operators
• Any report generation.
Standard supports
• SRU/W, Z39.50, UnAPI (http://unapi.info/), COinS/OpenURL
• OpenSearch (http://opensearch.a9.com/)
• Records are stored internally in an SGML-like format and can be retrieved
in MARCXML, Dublin Core, MODS, RSS, Atom, RDF-DC, SRW-DC,
OAI-DC, and EndNote;
• OPAC can be used by citation tools such as Zotero
• Koha 3.x includes support for 3M’s Standard Interchange Protocol (SIP2),
using the OpenNCIP libraries (http://openncip.org)
• Cross-platform, multi-RDBMS architecture
• News writer, label creator, calendar, OPAC comments, MARC staging and
overlay, notices, transaction logs, guided reports with a data dictionary and
task scheduler, classification sources/filing rules etc.
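Of the standards listed above, SRU is the easiest to illustrate, because a search is expressed as an ordinary Web address. The sketch below, in Python, builds an SRU 1.1 searchRetrieve request. The base URL and the "marcxml" record schema name are assumptions for illustration; the actual endpoint and the schemas on offer depend on how a particular server has been configured.

```python
from urllib.parse import urlencode


def build_sru_url(base_url: str, cql_query: str,
                  start_record: int = 1, max_records: int = 10,
                  record_schema: str = "marcxml") -> str:
    """Return an SRU 1.1 searchRetrieve URL for the given CQL query."""
    params = {
        "version": "1.1",
        "operation": "searchRetrieve",
        "query": cql_query,              # query written in CQL
        "startRecord": start_record,     # position of the first record wanted
        "maximumRecords": max_records,   # upper limit on records returned
        "recordSchema": record_schema,   # assumed schema name
    }
    return f"{base_url}?{urlencode(params)}"


# Example with a hypothetical endpoint:
# build_sru_url("http://sru.example.org/biblios", 'dc.title = "library automation"')
```

The server answers with an XML document containing the matching records, which a client or another library system can then process.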
Web 2.0 features
• Can generate RSS (including ATOM) feed for search query
• Supports information mashup (OPAC can be linked with book jacket service,
book rating/review from Amazon, Google books, Syndicate LibraryThing,
Open Library etc.)
• Users can submit comments/rating/tags for any item from any device (mobile
OPAC)
• Can be integrated easily with many Web 2.0 tools like Zotero, Delicious, etc.
Important URLs
• Downloading (http://koha-community.org/download-koha/);
• Documentation (http://koha-community.org/documentation/);
• Users list (http://wiki.koha-community.org/wiki/Category:Koha_Users)
• Wiki (http://wiki.koha-community.org);
• Mailing list (http://koha-community.org/support/koha-mailing-lists/);
• Free support (http://koha-community.org/support/free-support/);
• IRC (http://koha-community.org/get-involved/irc/); and
• Calendar of events (http://koha-community.org/calendar/).
Remark
Koha has already established itself as a global trend setter in the domain of ILS.
Many libraries in India, such as the Delhi Public Library system and the
Konkan Public Library system, are using Koha. There are almost 2500 installations of Koha.
The inspiring examples are the National Library of Venezuela (7.5 million
volumes), Delhi Public Library (1.4 million volumes), and the United Nations
Food and Agriculture Library (1 million volumes). Koha provides mature support
for all major library standards including MARC21 (a family of five standards),
UNIMARC, Z39.50 (server and client), SRU/SRW, SIP2, OAI/PMH, Unicode
etc. Koha presently serves the needs of a wide range of libraries from academic
to public and from special and research libraries to corporate libraries.
3.4.3 NewGenLib
NewGenLib or NGL started as a commercial ILS in 2005 and was made available as
an open source ILS under GNU GPL in 2008. NewGenLib is the result of
collaboration between a charitable trust called Kesavan Institute of Information
and Knowledge Management (KIIKM), Hyderabad and Verus Solutions Pvt.
Ltd. It is a platform independent ILS that can be installed in both Windows and
Unix-like OS. NGL has five functional modules – technical Processing
(Cataloging), circulation, acquisitions, serials management and web OPAC
including administration for parameters settings and report generation. The
features of the ILS are:
• Architecture (completely web based and adheres to international standards,
supports web services and allows networking of unlimited number of
libraries, database and operating system independent and uses open-source,
n-tier, and Java based technologies for scalability, reliability and efficiency);
• Enhanced services (Import of MARC data from sources such as OCLC and
freely available web-based resources, Extensive use of setup parameters in
configuring the software to suit specific needs, e.g., in management of fines,
Multi-user and multiple security levels, Automated email facility integrated
into different functions of the software to ensure efficient communication
between library and users, vendors, Module-specific querying in all
modules);
Special features
• Functional modules are completely web based and use Java Web Start™ technology
• Compliant with international metadata and interoperability standards:
MARC-21, MARC-XML, Z39.50, SRU/W, OAI-PMH
• Runs on open source components like Java SE, PostGreSQL
• A high degree of scalability
• OS independent - Windows and Linux flavours available
• Z39.50 Client for distributed searching
• Multilingual support (Unicode 4.0 compliant, easily extensible to support
Indic scripts, storage, processing and retrieval of multilingual data)
• Provision for RFID integration
• Alerting and messaging services integrated into different modules of the
ILS
• Templates for generation of form letters, using XML-based OpenOffice
templates
• Scope for extensive customisation like other open source ILS
• Supports digital media archiving and is Android compatible.
Important URLs
• Downloading (http://www.verussolutions.bis/web/content/download);
• Documentation (http://www.verussolutions.bis/web/content/documentation);
• Users list (http://wiki.koha-community.org/wiki/Category:Koha_Users);
• Help from experts (http://www.verussolutions.bis/web/content/do-you-need-
urgent-help-newgenlib-get-expert-help-free-cost);
• Forum (http://www.verussolutions.bis/web/content/forum); and
• Free support (http://www.verussolutions.bis/web/content/get-help-librarians-
my-region).
Remark
NGL is the first open source ILS released from India. It is now a mature open
source ILS and many libraries are using NGL. It is under continuous development;
for example, NGL Touch was recently developed as a library kiosk application. The
features of NGL ILS are quite suitable for Indian libraries for obvious reasons.
Both free and paid support are available for this ILS alongside discussion forum,
blog and documentation services.
3.4.4 PMB
Müller (2011) reported that PMB (PhpMyBibli) is improving rapidly and coming
up as a fully featured open source integrated library system. The PMB ILS project
was started in October 2002 by François Lemarchand, the then Director of the
Public Library of Agneaux, France. Presently it is managed by PMB Services, an
initiative to support open source software. PMB is a Web-enabled ILS and uses
XAMP architecture (X – any OS; Apache as Web server, PHP as programming
environment and MySQL as RDBMS). It also uses AJAX to support an interactive
and collaborative framework. This software is easy to install compared with
other ILSs from the open source domain. It supports both Windows and Linux
platforms with XAMP architecture. This open source ILS is available with interfaces
in four languages (English, French, Spanish, Italian). The first version was
released in the year 2003 and the current version is 4.1 (released in March 2014).
PMB, as an open source ILS, was initially available under GNU GPL licensing
but presently it is available under the CeCILL free software license. This platform
independent open source ILS supports all basic library automation workflows
alongside some advanced features like OPAC 2.0 and electronic SDI service.
System requirements
PMB is based on Web architecture. This means that only the server version needs
to be installed; on client machines, Web browsers (like Firefox, Google Chrome,
IE etc) act as client software to access the PMB server. The minimum hardware
requirements of server and client machines are as follows:
Server level
• A high-end desktop or entry-level server
• 1GB RAM
• Architecture to run Windows or Unix-like Operating System
• Port 80 should be open in the firewall for TCP/IP connections to access
the OPAC and staff client of PMB ILS
• Network to establish TCP/IP connections.
Client machines
• Low-end desktop with any operating system
• A reliable high speed Internet connection for enabling AJAX based services
• 256 MB of RAM
• TCP/IP protocol to connect to the PMB server at port 80.
Companion software
Apart from PMB itself, the server machine requires the following companion
software to run PMB:
1) Any Operating System
2) MySQL as RDBMS (version 9 or later)
3) Apache as Web server (version 2.x)
4) PHP programming environment (version 5.x or later).
Major Features
Apart from supporting basic activities and automation operations, PMB
supports authority file management, linking of subject headings with the UNESCO
thesaurus in the cataloguing interface, Web 2.0 features (such as RSS feed, user
tagging), SDI service module, facility to search formulae (mathematical and
chemical formulae), links to search external sources (Amazon, US books etc),
shelf management, basic cataloguing of different document forms, on-line help
etc. The regular features are as follows:
Important URLs
• Downloading (http://forge.sigb.net/redmine/projects/pmb/files);
• Documentation (http://www.sigb.net/index.php?lvl=cmspage&pageid=20);
• User community (http://www.sigb.net/index.php?lvl=cmspage&pageid=18); and
• Technical support (http://www.sigb.net/index.php?lvl=cmspage&pageid=17).
Remark
PMB is quite suitable for small and medium scale libraries. The ease of installation
and configuration makes it a suitable candidate for public libraries in India. It
can be customised to a great extent to incorporate Indian languages. The only
problem of this open source ILS is that the PMB portal is available in French
language only and this ILS supports only UNIMARC format.
3.5.1 LIBSYS
LibSys (http://www.libsys.co.in/) is an indigenous ILS designed and developed
by LibSys Corporation, New Delhi in 1984. LibSys is presently available in six
different editions/versions to suit the requirements of different types of libraries.
These are:
LIBSYS 7: This version of LibSys has features like Unicode Support, Federated
Searching, Customisable look and feel, User notification through E-mail and
SMS, RSS feeds and integration with Google Books, BookFinder, etc. and
interactive features like online reviews, ratings, renewals, reservations etc. The
modules are – Acquisition, Cataloguing, Circulation, Serials, Article Indexing,
Web OPAC, Customisable Reports. LibSys 7 supports following standards –
MARC21, Unicode, SRU/SRW, Z39.50, NCIP (NISO), SICI Barcode.
LSEase: The basic features of this version of LibSys are – independent of
Operating System, support for digital media archiving, user-friendly workflow,
user-defined security, may be extended to Web architecture.
LSAcademia: It is an ERP Solution to integrate administration of academic
institutions and ILS. Apart from library management, it supports Admissions,
Student Management, Academic Administration, Examination/ Results, Fee
Management, Learning Triggers, Time Table, Student/ Parent Portal, Faculty/
Director Portal, Bus Use, Hostel, Staff Management, Payroll, Alumni etc.
LSmart: It integrates RFID and EM hardware from world renowned
manufacturers with LIBSYS and thereby offers following add-on services - RFID
Tags on Books/Documents and CD/DVDs, Multiple item processing
simultaneously, Self-use Kiosk for check-out/check-in, Book Drops for quick
check-in of items, Hand held RFID readers for Shelf Management, EAS Security
Gates, Books Sorters to reduce items replacement times on shelves.
LSNet: This version of LibSys revolves around a virtual library that includes the
collection of books, CD/DVDs, reference material, etc through a single Web-
enabled search interface. It may be integrated with LIBSYS 7 to provide platform
for sharing e-content, promotion of library materials, value added services like
book updates, reviews, upcoming titles etc.
LSDigital: It is a complete Digital Resource Management System (DRMS) which
can be integrated with LIBSYS 7 for value-added digital contents dissemination.
The integration provides implicit interaction with the LIBSYS database, full-text
and bibliographic searching through the LIBSYS OPAC, conversion of different data
into the format of choice (PDF, Doc, etc.), definition and organisation of library data
structure / flow according to needs, and support for various image manipulations.
3.5.2 SLIM
SLIM (System for Library Information Management) is a client-server architecture
based ILS developed by Algorhythms consultants Pvt. Ltd., Pune (http://slimpp.com).
It is a module-based LMS that offers a wide range of functionality
for library management. Presently there are two versions of SLIM – SLIM 21
and SLIM++.
SLIM 21: There are three levels of the SLIM 21 version – Basic Level (Acquisition,
Cataloguing, Serials control, Circulation and OPAC); Enterprise Level (Basic
Level integrated with Web based OPAC, Selective Dissemination Information
(SDI), Inter Library Loan (ILL), Current Awareness Service (CAS), Web
Proposals, Statistical Analysis); and L2L Level (Basic level + Enterprise level
integrated with Z39.50 client, Z39.50 server, MARC-XML). All of these three
levels are supported by additional utilities like Colon classification shelving order,
Touch Chip Interface (Biometrics), Newspaper monthly billing, Smart Card /
RFID interface, Library Map and News clipping publishing, Multilingual data
processing and retrieval, Support for standards like NCIP, SIP2, ISO-2709 etc.
3.5.3 SOUL
SOUL (http://www.inflibnet.ac.in/soul/) is one of the oldest ILS initiatives in India.
The story of SOUL (Software for University Libraries) started with the
development of ILMS (Integrated Library Management Software) by INFLIBNET
in collaboration with DESIDOC. INFLIBNET later decided to develop a state-of-the-art,
user friendly, Windows based system which would contain all the features/
facilities available with other ILSs in the market. As a result, the first version
(version 1.0) of SOUL was released in February
1999 during CALIBER-99 at Nagpur. SOUL uses RDBMS on Windows NT
operating system as backend to store & retrieve data. The SOUL has six modules
– Acquisition; Cataloguing; Circulation; Serials Control; OPAC and
Administration. The modules have further been divided into sub-modules to
take care of various functions normally handled by the university libraries. The
features of SOUL version 1.0 are: Windows based user friendly system with
extensive help messages at affordable cost, Client-server architecture based system
allowing scalability to users, Uses RDBMS MSSQL to organise data, Multi-
user software with no limitation for simultaneous access, User friendly OPAC
with web access facility, Supports bibliographic standards like CCF & AACR II
and ISO 2709 for export & import facility, Provides facility to create, view &
print records in regional languages, Supports LAN & WAN environment and
Available in two versions – university library version and college library version.
The second version of SOUL, named SOUL 2.0, was released in January 2009.
SOUL 2.0 provides two options for back end DBMS - MS-SQL and MySQL.
SOUL 2.0 is compliant with international standards such as MARC 21 bibliographic
format, Unicode based Universal Character Sets for multilingual bibliographic
records and NCIP 2.0 and SIP 2 based protocols for electronic surveillance and
control. It uses MARC-XML as the standard for export/import; supports cataloguing of
electronic resources such as e-journals, e-books and virtually any type of material;
supports requirements of digital libraries and facilitates links to full-text articles
and other digital objects; and supports ground-level practical requirements of
libraries such as stock verification, book bank, vigorous maintenance functions,
transaction level enhanced security, etc.
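ISO 2709, named above as an export and import format, defines how a bibliographic record is laid out as a single stream of bytes: a 24-character leader, a directory of 12-character entries (tag, field length, starting position) and then the field data. The following Python sketch writes and reads such a record to make the layout concrete. It is a simplified illustration and not the code of SOUL or any other ILS; the leader codes are fixed sample values and the function names are hypothetical.

```python
FIELD_TERMINATOR = b"\x1e"
RECORD_TERMINATOR = b"\x1d"
SUBFIELD_DELIMITER = "\x1f"


def build_iso2709(fields):
    """Serialise (tag, text) pairs into one ISO 2709 record (bytes)."""
    directory, data = b"", b""
    for tag, text in fields:
        body = text.encode("utf-8") + FIELD_TERMINATOR
        # Directory entry: 3-char tag, 4-digit length, 5-digit start position.
        directory += f"{tag}{len(body):04d}{len(data):05d}".encode("ascii")
        data += body
    base_address = 24 + len(directory) + 1      # leader + directory + terminator
    record_length = base_address + len(data) + 1
    # Sample leader: positions 0-4 record length, 12-16 base address of data.
    leader = f"{record_length:05d}nam a22{base_address:05d} a 4500"
    return (leader.encode("ascii") + directory + FIELD_TERMINATOR
            + data + RECORD_TERMINATOR)


def parse_iso2709(record: bytes):
    """Return (leader, [(tag, text), ...]) from one ISO 2709 record."""
    leader = record[:24].decode("ascii")
    base_address = int(leader[12:17])
    directory = record[24:base_address - 1]     # drop the directory terminator
    fields = []
    for i in range(0, len(directory), 12):
        entry = directory[i:i + 12].decode("ascii")
        tag, length, start = entry[:3], int(entry[3:7]), int(entry[7:12])
        raw = record[base_address + start:base_address + start + length]
        fields.append((tag, raw.rstrip(FIELD_TERMINATOR).decode("utf-8")))
    return leader, fields


def split_subfields(text: str):
    """Split a data field into its two indicators and (code, value) subfields."""
    chunks = [c for c in text[2:].split(SUBFIELD_DELIMITER) if c]
    return text[:2], [(c[0], c[1:]) for c in chunks]
```

Because lengths and positions are stored in the directory, a program can jump straight to any field, which is one reason the format is still used for exchanging records between library systems.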
ABCD
ABCD (Automation of Libraries and Documentation Centers) is a comprehensive
Web-enabled integrated library automation system developed by BIREME, Brazil.
It is based on CDS/ISIS as back end databases and WWWISIS as middle-ware.
The web interface of CDS/ISIS, called WWWISIS was developed by BIREME
in 2005. BIREME in 2010 developed ABCD by using CDS/ISIS as database and
WWWISIS as CGI script for designing Web-enabled ILS. It includes all major
activities generally expected from a third-generation ILS. Core modules are –
Cataloging, Circulation, Acquisitions, Statistics and Reports and OPAC. It also
includes a facility called “Adds a Site”. This facility is a built-in feature in ABCD
to support content management system (CMS). It allows easy production of a
library website with integrated meta-search option. In ABCD, cataloguers may
use predefined bibliographic formats (like MARC21, UNIMARC, CEPAL) or
they may create custom format by using FDT (Field Definition Table) utility of
CDS/ISIS. As a whole, ABCD is a very flexible and versatile ILS for use in
libraries and information centres that need non-standard database structures for
non-bibliographical applications such as expert databases, data banks and technology
directories. ABCD (present version is 1.0) includes two circulation interfaces – i)
standard loans-module; and ii) advanced loans module. The advanced circulation
module provides external links with SQL-databases. The upcoming version 2.0
of ABCD will include digital media archiving module. This module will provide
facility to handle textual objects and multimedia objects with full-text indexing
facilities. The problem of ABCD is that it is not Unicode-compliant (the problem
is inherited from CDS/ISIS) and therefore, cannot handle Indic scripts based
documents. ABCD is available under GPL (version 3) and is independent of
Operating System (browser based cross-platform system) with standards support
like MARC 21, MODS, OAI, XSLT. The programming environments are open
source components like Java, JavaScript and PHP. As a whole ABCD is based
on an array of technologies like ISIS database, ISIS formatting language, CISIS,
ISIS Script, ISIS NBP, JavaScript, Groovy and Jetty, PHP, MySQL, Apache and
YAZ.
Resources:
• Technological features (http://reddes.bvsaude.org/projects/abcd/wiki/
Features);
• Wiki (http://wiki.bireme.org/en/index.php/ABCD);
• Download (http://bvsmodelo.bvsalud.org/download/abcd/ABCD_1.0_wis_
full.exe);
• Project homepage (http://reddes.bvsaude.org/projects/abcd).
e-Granthalaya
e-Granthalaya has improved a lot recently through continuous up-gradation. The
current release (version 3.0) supports almost all core activities of an ILS alongside
advanced features like e-book management, Web-OPAC, predictive serials
control, Unicode-compliant multilingual support, easy data migration and MARC
21 support for both bibliographic and authority data. This ILS is a product of
National Informatics Centre (NIC), Department of Electronics & Information
Technology, Ministry of Communications and Information Technology,
Government of India. The only problem of e-Granthalaya is its dependency on
Microsoft products (commercial closed source software) like VB.NET or ASP.NET
and MSSQL server 2005. The software can be implemented either in stand-
alone or in client-server mode. In client-server mode database and WebOPAC
are installed on the server PC while the data entry program is installed on client
PCs. The version 3.0 of e-Granthalaya supports union catalog output. The major
features of this freeware ILS are as follows:
• Technological features (runs on Windows Platform Only (Win XP/vista/7/
8/Server 2003/2008) on LAN/WAN environment, UNICODE Compliant,
supports data entry in local language);
• Administration (Module - Wise Permission to the software Users, Work-
flow as per Indian Libraries and Retro-Conversion as well as Full Cataloguing
Modes of Data Entry, Library Statistics Reports);
• Cataloguing (Authority Files/ Master tables for Authors, Publishers, Subjects,
etc, Multi-Vol, Multi-Copy and Child-Parent Relationship pattern, Z39.50
Client Search Built-in, Export Records in CSV/Text File/MARC 21/MARC
XML/ISO:2709/MS ACCESS/EXCEL formats, Centralised Database for
member libraries, Import Data from any structured Source (MARC21/
EXCEL), Generate Bibliography in AACR2, Data Entry Statistics Built-In,
e-Books management with digital files in pdf or other formats);
• Acquisition (Main/Branch Libraries Acquisition/Cataloguing, Print
Accession Register, Bulk accessioning in single click, Budget and account
control, Budget Modules with Bill Register Generation, Manages multi-
budget heads, Exchange rates, Report generation, Printing accession register
etc.);
• Circulation (Issue/return, Membership module, Bar-coding support,
comprehensive circulation reports);
• Serials control (Subscription/renewal with auto-generate schedule, CAS/
SDI Services and Documentation Bulletin, Micro-Documents Manager
(Articles/Chapter Indexing));
• OPAC and Utilities (Search Module built-in with basic/advance/boolean
parameters, Full Text News Clipping Services, Digital media integration
with uploading / downloading of pdf/html, etc documents, Web Based OPAC
Interface, Photo Gallery available for uploading photo and pictures of the
organisations - published on the Library Web site).
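Among the export formats named in the cataloguing features above, MARC XML expresses the same leader, control fields and data fields as MARC 21, but as XML elements. The sketch below, in Python, shows the general shape of such a record. It is an illustration only and not the export routine of e-Granthalaya; the function name and the sample values in the usage comment are hypothetical.

```python
import xml.etree.ElementTree as ET

MARC_NS = "http://www.loc.gov/MARC21/slim"   # MARCXML namespace
ET.register_namespace("marc", MARC_NS)


def record_to_marcxml(leader, control_fields, data_fields) -> str:
    """Serialise one record as a MARCXML <record> element.

    control_fields: [(tag, value), ...]
    data_fields:    [(tag, ind1, ind2, [(code, value), ...]), ...]
    """
    record = ET.Element(f"{{{MARC_NS}}}record")
    ET.SubElement(record, f"{{{MARC_NS}}}leader").text = leader
    for tag, value in control_fields:
        field = ET.SubElement(record, f"{{{MARC_NS}}}controlfield", tag=tag)
        field.text = value
    for tag, ind1, ind2, subfields in data_fields:
        field = ET.SubElement(record, f"{{{MARC_NS}}}datafield",
                              tag=tag, ind1=ind1, ind2=ind2)
        for code, value in subfields:
            sub = ET.SubElement(field, f"{{{MARC_NS}}}subfield", code=code)
            sub.text = value
    return ET.tostring(record, encoding="unicode")


# Example with sample values:
# record_to_marcxml("00000nam a2200000 a 4500", [("001", "rec-0001")],
#                   [("245", "1", "0", [("a", "Library automation")])])
```

Since the output is ordinary XML, it can be validated, transformed or loaded by any XML-aware tool, which makes the format convenient for data migration between systems.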
Resources
• Portal (http://egranthalaya.nic.in/);
• Forum (https://lsmgr.nic.in/mailman/listinfo/egranthalaya_forum);
• Software request (http://egranthalaya.nic.in/Request%20Form.pdf);
• Documentation (http://egranthalaya.nic.in/eG3_UserManual.pdf).
WEBLIS
WEBLIS stands for Web based Library and Information System. This Web based
ILS is based on CDS/ISIS. It has been developed by the Institute for Computer
and Information Engineering (ICIE), Poland by combining CDS/ISIS and WWW-
ISIS engine (also developed by ICIE). It is freeware ILS and provides basic
library workflow support through four modules – Cataloguing system, OPAC
(search), LOAN module, Statistical module. WEBLIS is presently supported by
UNESCO. The features of these four components of WEBLIS are:
3) OPAC (Simple and advanced search, Search history, Saving queries function,
and ISIS Query language facilities, Thesaurus based search support, ISO-
2709 based export/import);
• Value-added services: Patron self service through RFID & Smart card (self
circulation, self reservation etc.), Online user training/orientation, Stock
verification facility, Members photo ID card generation, Barcode generation,
Fine calculation & receipt generation, Gate pass generation, Bulletin board
services & e-mail reports, Electronic SDI, CAS support, Digital media
archiving support.
Functional checklist: The following general features are part of software module
testing, and each functional activity must be tested or conducted during the
evaluation process:
• Searching Capabilities (All modules)
• Data Entry and Editing (All modules)
• Bibliographic/item File and Maintenance
• Cataloguing editor (Cataloguing)
• Authority Control (Cataloguing)
• Inventory (Circulation)
• Check-out (Circulation)
• Renewal (Circulation)
• Circulation/Management Reports (Circulation)
• Check-in (Circulation)
• Fines and Fees (Circulation)
• Notice Production (Circulation)
• Holds (Circulation)
• Recalls (Circulation)
• Patron File (Circulation)
• Reserves (Circulation)
• Portable Back-up Units
• Report Writer
• Acquisitions
• Serials
• Electronic Databases
• Gateways
• Network Operations
• Z39.50 Client and Server
• Inter-Library Loan
• Web Accessibility
• Integrated Archiving
• Self Registration
• Statistics Generation
• Export and Import
• Fund Accounting
• Digital media archiving.
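The checklist above can be tracked during evaluation with a simple pass/fail record per feature. The following is a minimal sketch only: the function name `evaluate`, the shortened checklist and the sample results are hypothetical and are not part of any ILS or of the evaluation procedure described in this Unit. A feature that was never tested is counted as not passed.

```python
def evaluate(checklist, test_results):
    """Summarise which checklist items passed during hands-on testing.

    checklist    -- list of feature names to be tested
    test_results -- dict mapping feature name to True (passed) or False (failed);
                    features missing from the dict are treated as not passed
    """
    passed = [item for item in checklist if test_results.get(item) is True]
    failed = [item for item in checklist if test_results.get(item) is not True]
    coverage = len(passed) / len(checklist) if checklist else 0.0
    return {"passed": passed, "failed": failed, "coverage": coverage}


# Hypothetical extract of the circulation part of the checklist.
CHECKLIST = ["Check-out", "Check-in", "Renewal", "Holds"]

# Hypothetical outcome of testing one candidate ILS ("Holds" was never tested).
results = {"Check-out": True, "Check-in": True, "Renewal": False}

report = evaluate(CHECKLIST, results)
print(report["failed"], report["coverage"])
```

Running the same function for each short-listed ILS gives a comparable list of untested or failed features per package.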
Data conversion and backup utility: The ability of the ILS to support
data conversion from other library systems, and its adherence to international
bibliographic data standards and protocols, should be checked extensively. In
this age of shared cataloguing systems and web integration, the ILS should also
support metadata schemas and interoperability standards such as XML, RDF and
OAI/PMH. The backup facility on suitable media should also be checked, so that
data can be recovered when needed.
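One way to check data conversion support is to export a few records as MARC-XML and confirm that they can be read and mapped by an independent tool. The sketch below uses only the Python standard library. The sample record is hand-written for illustration, and the three-field crosswalk to Dublin Core element names is a deliberate simplification, not a complete or official mapping.

```python
import xml.etree.ElementTree as ET

# Namespace of the MARC 21 XML ("slim") schema.
MARC_NS = {"marc": "http://www.loc.gov/MARC21/slim"}

# Hypothetical sample record, written by hand for illustration only.
SAMPLE_MARCXML = """<record xmlns="http://www.loc.gov/MARC21/slim">
  <leader>00000nam a2200000 a 4500</leader>
  <datafield tag="100" ind1="1" ind2=" ">
    <subfield code="a">Ranganathan, S. R.</subfield>
  </datafield>
  <datafield tag="245" ind1="1" ind2="4">
    <subfield code="a">The five laws of library science</subfield>
  </datafield>
  <datafield tag="650" ind1=" " ind2="0">
    <subfield code="a">Library science</subfield>
  </datafield>
</record>"""

# Simplified, illustrative crosswalk: MARC tag -> Dublin Core element name.
CROSSWALK = {"100": "creator", "245": "title", "650": "subject"}


def marcxml_to_dc(xml_text):
    """Map subfield $a of selected MARC fields to Dublin Core element names."""
    root = ET.fromstring(xml_text)
    dc = {}
    for field in root.findall("marc:datafield", MARC_NS):
        element = CROSSWALK.get(field.get("tag"))
        if element is None:
            continue  # tag not covered by this small crosswalk
        sub = field.find("marc:subfield[@code='a']", MARC_NS)
        if sub is not None and sub.text:
            dc.setdefault(element, []).append(sub.text.strip())
    return dc


dc_record = marcxml_to_dc(SAMPLE_MARCXML)
print(dc_record)
```

If records exported from the ILS under evaluation cannot be processed in this way, its claimed MARC-XML support deserves closer scrutiny.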
Hardware and third party software requirements: The ILS should provide a
complete list of hardware requirements (processor type and RAM) for server
and client machines, operating system requirements and back end RDBMS (with
version) requirements. Evaluation should be based on total cost for minimum
hardware and third party software requirements of the package.
• The package must have support from the software vendor for hardware and
software maintenance, data conversion, emergency and on-call support and
disaster management.
3.7.3 Specific Parameters of Evaluation for Freeware and Open
Source ILSs
The Public Library Association (PLA), a division of the American Library
Association (ALA), recommended a set of criteria for selecting an open source
ILS for a library (see http://www.ala.org/pla/tools/technotes/opensourceils).
These criteria, in addition to the general criteria discussed above, must be
kept in mind when selecting an open source ILS. The minimum essential
criteria specific to open source ILSs are as follows:
• Currency and regular releases: The open source ILS under consideration
must have at least two substantial releases a year along with a road map for
future development activities.
• Core modules: All core activities of a library like acquisition, cataloging,
circulation, serials control, systems administration and patron access catalog
modules must be available. Value-added services that are required to run
library operations smoothly (like barcode generation, fine calculation, gate
pass printing, member card printing, web-OPAC etc.) must be included in the
road map of development.
• Standard Data Formats: MARC 21 family of standards (at least MARC
21 bibliographic format and Authority format) should be supported alongside
export/import facilities (based on ISO-2709/MARC-XML). Availability of
UNIMARC format in addition to MARC 21 standards is an added advantage.
• IPR and Licensing: Current source code and technical documentation are
available for downloading under the GNU General Public License.
• User base: The product is currently in use in a significant number of libraries.
• Scalability: Scalability should not be an issue; it means there should be no
risk of database size or activity levels exceeding the capacity of the software.
• Developer group: A dedicated group of developers ensures the progress of
the open source ILS under consideration, for example by adopting cutting-edge
technologies in developing new features and facilities.
Of course, the main OSS ILSs in the U.S., Evergreen and Koha, meet all of these
criteria. Libraries that have already decided to choose one of these systems will
need to consider other factors. The Massachusetts Library Network Cooperative
has released a useful list of points comparing these systems
(http://masslnc.cwmars.org/node/1892).
General
• Improve discovery and use of library resources via an open-ended variety of
external applications that build on the data and services of the ILS;
• Articulate a clear set of expectations;
• Make recommendations applicable to both existing and future systems and
technologies;
• Support interoperation and cooperation with applications outside the
traditional library domain;
• Ensure that the recommendations will be feasible to implement; and
• Be responsive to the user and developer community.
• BDI should facilitate metadata harvesting, availability checking for resources
(within and outside of library system) and bibliographic request functionality;
• Data aggregation, Real Time search, Patron functionality, and OPAC
interaction;
• Compatibility with the established and emerging standards like OAI/PMH,
SRU/SRW, METS, MODS, DCMES, MARC-XML, NCIP etc.;
• Facilities to expose bibliographic records to different external discovery
tools (such as SOPAC, VuFind, etc.).
Data aggregation
• Many external discovery applications need to maintain external copies of
ILS data; support should therefore be provided for extracting, or
harvesting, ILS data (bibliographic, authority, holdings, and other item
metadata such as circulation information) in bulk;
• Facilities must be provided for selective harvesting to support external
metadata transformation, cleanup, relationship building (FRBRising),
vocabulary mapping and other processing services;
• Bibliographic records should be in a well-specified format and each record
should have a unique persistent identifier;
• Bibliographic records must be available in interchangeable formats
(for example, a MARC record stored as relational table elements could be
returned as native MARC 21, as MARC-XML, or as DCMES, MODS
or METS); and
• Support for compatibility with different text retrieval engines (for example,
a Lucene index of bibliographic records that can be searched with facets
using Solr).
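Bulk and selective harvesting of this kind is commonly done through OAI/PMH. The sketch below shows how an external discovery tool might build a `ListRecords` request and pick up the identifiers of non-deleted records from a response. The base URL, the sample response and the function names are hypothetical. The request parameters (`verb`, `metadataPrefix`, `from`, `set`) and the response namespace are those defined by OAI-PMH 2.0. No network call is made.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespace of OAI-PMH 2.0 responses.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def build_list_records_url(base_url, metadata_prefix="oai_dc",
                           from_date=None, set_spec=None):
    """Build a ListRecords request; from_date and set_spec make it selective."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date:
        params["from"] = from_date
    if set_spec:
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)


# Hypothetical, hand-written response fragment for illustration only.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:library.example.org:1001</identifier>
        <datestamp>2024-01-15</datestamp>
      </header>
    </record>
    <record>
      <header status="deleted">
        <identifier>oai:library.example.org:1002</identifier>
        <datestamp>2024-01-16</datestamp>
      </header>
    </record>
  </ListRecords>
</OAI-PMH>"""


def extract_identifiers(response_xml):
    """Return the persistent identifiers of records not marked as deleted."""
    root = ET.fromstring(response_xml)
    identifiers = []
    for header in root.iterfind(".//oai:record/oai:header", OAI_NS):
        if header.get("status") == "deleted":
            continue  # record withdrawn at the source
        identifiers.append(header.findtext("oai:identifier", namespaces=OAI_NS))
    return identifiers


url = build_list_records_url("https://library.example.org/oai",
                             from_date="2024-01-01")
ids = extract_identifiers(SAMPLE_RESPONSE)
print(url)
print(ids)
```

The `identifier` element plays the role of the unique persistent identifier required above, and the `from` parameter is what makes incremental (selective) harvesting possible.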
Select Entity
This function describes the processes of selecting an entity for acquisition and
includes workflows such as Obtain Metadata and Create Metadata. The resources
may be gifts, approval plan items, firm orders, interlibrary loan requests,
reserve requests, remote location requests, publication references or trial
databases. Metadata can be obtained (if available) or created for descriptive,
holdings (e.g. what is available and being considered for acquisition),
authority, financial, or other types. The metadata may be harvested from or
deposited by another system.
Acquire Entity
Associated license/registry terms are managed and documented within the system
through this function. The workflow includes selecting the entity, assigning a
supplier/vendor, managing funds, determining the claiming cycle etc. The invoice
process and payment activity may be executed manually or electronically (using
protocols such as EDIFACT, ANSI X12 and XML EDI).
Describe Entity
This function is associated with description of physical or digital entities
(resources, collections, people, organisations, services, events, courses, facilities,
finances, relationships, etc.). It includes process to obtain, create, modify, delete,
or expose metadata for an entity.
Deliver Entity
This function describes the process in which a user submits a request for a
service or resource and an entity is supplied to satisfy that information
demand. Entities cover a wide range: physical/digital, returnable/consumable,
free/fee based, local/trans-local, and owned/external.
Manage Entity
This function covers processes that track the life-cycle of an entity including
preservation, conservation, evaluation, retention, relocation, duplication, version
preference, rights management, binding, repair, reformatting, replacement, and
withdrawal. The workflow includes Preserve/Conserve Resource, Manage
Inventory, Configure Metadata, Manage Rights, and Reformat Resource.
Deliver Module
• This module covers the interactions between the library, its collection, patrons
and discovery systems and provides the basic features/functions to manage
patron records, item records, circulation tasks, holds management, fine
calculation, NCIP standards compliance with local parameters e.g., patron-
related blocks, item-related blocks, loan periods, notice types and notice
frequency, etc.
System Integration
• Systems Integration is the link between the three modules: Select and Acquire,
Describe and Manage, and Deliver. Kuali uses a common middleware suite
called Kuali Rice to achieve a service-oriented architecture (SOA). The SOA
supports interoperability with identity management, acquisitions/financial
accounting, course and learning management, and student
information systems.
3.9 SUMMARY
This Unit covered the ILSs available in India in depth. It provided a historical
and theoretical foundation of library automation software development, spanning
the last sixty years and five generations. The five generations of ILSs were
compared against a set of parameters framed in view of the technologies in use
and the services expected to be available. After discussing the features of the
different generations of ILS, the Unit compared the ILSs available in India on
the basis of two sets of characteristics: distribution policy (commercially
available ILS, open source ILS and freeware ILS) and place of origin (foreign,
Indian, and originated abroad but developed in India). This Unit discussed the
features of four of the most promising open source ILSs, four commercial ILSs
(selected on the basis of their user base in India) and three visible freeware
ILSs. As evaluation is considered one of the most important tasks in the library
automation process, this Unit discussed evaluation parameters under three heads:
generic (applicable to all kinds of ILS irrespective of distribution policy and
place of origin), specific parameters to be considered for evaluating commercial
ILSs, and parameters important for evaluating open source ILSs. The Unit ended
with a brief discussion of two sets of global recommendations in the domain of
library automation, namely the ILS-DI recommendations and the OLE
recommendations, and threw light on the impact of these recommendations on the
future development of ILSs.
3.10 ANSWERS TO SELF CHECK EXERCISES
3) The major features of the fifth generation ILSs are: AJAX support, support
for FRBR, FRAD and FRSAD, support for Linked Open Data, use of open
interoperability standards, provision of Cloud and Web-scale resource
discovery, and support for federated search.
4) Open source ILSs are freely available under the GNU GPL, are extensively
customisable (as the source code is available) and are based on global open
standards in the domain of library automation. The major open source ILSs
are Koha, Evergreen, PMB, Avanti, NewGenLib and so on.
6) There are many open source ILSs, of which Koha appeared first, in the year
2000. It is now considered the most feature-rich open source ILS in the
world. The user base of Koha is increasing rapidly all over the world. Many
libraries are switching from commercial ILSs to Koha because of the following
features: i) Web-centric architecture; ii) compliance with all major standards
in the domain of library automation; iii) OPAC 2.0; iv) use of open source
companion software; v) multi-lingual and Unicode-compliant; vi) support for
all core and value-added features expected from fourth generation ILS
packages; and vii) OPAC available in 25 languages.
9) Virtua ILS, a product of VTLS Inc., US, is one of the most comprehensive
ILSs at the global scale. The real advantages of this ILS are: i) compliance
with all global standards of library automation; ii) full support for
bibliographic data models like FRBR, FRAD and FRSAD; iii) provision for
RDA-based cataloguing alongside MARC 21 and AACR 2; iv) full support
for Web 2.0 architecture to generate an interactive user interface; v) very
sophisticated search mechanisms; vi) facility to create customised workflows
for the library, and many more such facilities. Virtua ILS is used by many
national libraries including the National Library of India.
10) Freeware ILSs can be downloaded and used freely, but either they
depend on companion software that is not open source (e.g.
e-Granthalaya is based on Microsoft products like Windows OS, MSSQL
RDBMS and the ASP.NET programming environment) or they are based on a non-open
source textual database management system (e.g. ABCD and WEBLIS are
based on CDS/ISIS). The visible freeware ILSs are e-Granthalaya, ABCD
and WEBLIS.
11) The current version of e-Granthalaya (version 3.0) is a client-server
integrated library automation package that supports almost all core activities
of an ILS alongside some value-added services like news clippings, CAS/SDI,
article indexing, digital media archiving etc. It also supports many
library standards like MARC 21, MARC-XML, ISO-2709 and the Z39.50
protocol. The main disadvantage of this ILS lies in its heavy dependency
on Microsoft products (Windows OS, MSSQL, VB.NET/ASP.NET), which
are not open source software products. As a result, a library gets this
freeware ILS at no cost, but procuring the companion software places a huge
financial burden on the library budget.
12) A framework for evaluation of ILS is required for three major purposes:
i) selection of an ILS for procurement from a short-listed group of ILSs;
ii) selection of an ILS for migration from one ILS to another; and iii)
development of an RFP for seeking expressions of interest (EOI). The parameters
of selection must be based on the following factors: i) service availability
checklist and standards support checklist; ii) functional features; iii)
companion software requirement; iv) hardware support required; v) vendor
reputation (in case of commercial ILS); vi) project duration and release
cycle (in case of open source ILS); vii) data conversion and transfer support;
viii) software architecture; ix) support for cutting-edge technologies (like
AJAX, Web 2.0, Linked Open Data); and x) support for training,
documentation and on-call service (availability of forum, wiki and mailing list
in case of open source ILS).
13) The following specific parameters, apart from the generic parameters, should
be checked in selecting an open source ILS: currency and regular releases,
core modules support, standard data formats, IPR and licensing, user
base, scalability, and reputation and duration of the developer group.
3.11 KEYWORDS
Bibliographic metadata: Information about a resource that serves the purpose
of discovery, identification and selection of the
resource. Includes elements such as title, author,
subjects, etc.
EDI : Electronic Data Interchange (EDI) is a standard
method for exchanging structured data, such as
purchase orders and invoices, between computers
to enable automated transactions.
EDIFACT : Electronic Data Interchange for Administration, Commerce and Transport.
The concept of utilising a single set of specifications
for bibliographic records regardless of the type of
material they represent.
ERMS : Electronic Resources Management System is used
to manage a library’s electronic resources, primarily
e-journals and databases. Systems can include
features to track trials, license terms and conditions,
usage, cost, and access.
Evergreen : The first open source ILS designed to handle the
processing of geographically dispersed, resource-
sharing library networks and library consortia.
GPL : The GNU General Public License is an open source
license that is used by Evergreen and Koha.
ILS : An automated library system that utilises shared data
and files to provide interoperability of multiple
library functions, e.g. cataloging, acquisition,
circulation, serials, etc.
Interoperability : The ability for two different computer systems to
communicate and exchange information in a useful
and meaningful manner.
MARCXML : A metadata scheme for working with MARC data
in an XML environment.
Metadata : Structured information that describes an
information resource. “Data about data” for an
information bearing object for purposes of
description, administration, legal requirements,
technical functionality, use and usage, and
preservation.
Metadata harvesting : A technique for extraction of metadata from
individual repositories for collection into a central
catalog.
Module of ILS : Functions specific to a particular system capability
such as the online public access catalog, cataloging,
acquisitions, serials, circulation, etc.
NCIP : NISO Circulation Interchange Protocol (NCIP) is
a standard which defines a protocol for the exchange
of messages between and among computer-based
applications to enable them to perform functions
necessary to lend and borrow items, to provide
controlled access to electronic resources, and to
facilitate co-operative management of these
functions.
Open Source : A concept through which programming code is
made available through a license that supports
users freely copying the code, making changes to it,
and sharing the results. Changes are typically
submitted to a group managing the open source
product for possible incorporation into the official
version. Development and support are handled
cooperatively by a group of distributed
programmers, usually on a volunteer basis.
OpenSRF : Open Service Request Framework is developed by
Evergreen ILS team to achieve load balancing and
service availability.
SIP2 : Standard Interchange Protocol Version 2 is a standard
for the exchange of circulation data and transactions
between different systems.
SOA : Service-Oriented Architecture (SOA) is a software
framework for managing loosely-coupled,
distributed services which communicate and
interoperate via agreed standards.
SRU : Search/Retrieve via URL is a standard search
protocol for Internet search queries, utilising CQL
(Common Query Language), a standard query syntax
for representing queries.
SRW : Search/Retrieve Webservice is a web services
implementation of the Z39.50 protocol that
specifies a client/server-based protocol for
searching and retrieving information from remote
databases.
Unicode : A universal character-encoding standard used for
representation of text for computer processing.
Unicode provides a unique numeric code (a code
point) for every character, no matter what the
platform, no matter what the program, no matter
what the language. The standard was first published
by the Unicode Consortium in 1991.
Z39.50 : A NISO and ISO standard protocol that specifies a
client/server-based protocol for cross-system
searching and retrieving information from remote
databases. It specifies procedures and structures for
a client system to search a database provided by a
server.
Zebra : A high performance open source text retrieval
engine for indexing and retrieval, used by Koha as
its primary search system for bibliographic and
authority data.
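The idea of a code point in the Unicode entry above can be seen directly in most programming languages. The following is a small sketch using Python built-ins only; the helper name `describe` is hypothetical. It reports the code point of a character and the number of bytes it occupies in the UTF-8 encoding.

```python
def describe(char):
    """Return the Unicode code point and UTF-8 byte length of one character."""
    return {
        "code_point": f"U+{ord(char):04X}",          # unique numeric code
        "utf8_bytes": len(char.encode("utf-8")),     # storage size in UTF-8
    }


latin = describe("A")            # LATIN CAPITAL LETTER A
devanagari = describe("\u0905")  # DEVANAGARI LETTER A
print(latin, devanagari)
```

The code point is the same on every platform; only the number of bytes used to store it depends on the chosen encoding, which is why a Unicode-compliant ILS can hold multilingual catalogue records in one database.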
3.12 REFERENCES AND FURTHER READING
Breeding, M. The viability of open source ILS. Bulletin of the American Society
for Information Science and Technology, 35.2 (2009), pp. 20-25.
Digital Library Federation. DLF ILS Discovery Interface Task Group (ILS-DI)
Technical Recommendation (2008).
<www.diglib.org/architectures/ilsdi/DLF_ILS_Discovery_1.1.pdf>
Hodgson, Cynthia. The RFP writer’s guide to standards for library systems.
Bethesda, Maryland: National Information Standards Organization, 2002.
<http://www.niso.org>
Müller, T. How to choose a free and open source integrated library system. OCLC
Systems & Services, 27.1 (2011), pp. 57-78.
<http://eprints.rclis.org/15387/1/How%20to%20choose%20an%20open%20source%20ILS.pdf>
Open Library Environment. The Open Library Environment Project Final Report
(2009). <http://oleproject.org/final-ole-project-report/>
Singh, V. Why migrate to an open source ILS? Librarians with adoption experience
share their reasons and experiences. Libri, 63.3 (2013), pp. 206-219.
Yang, S. Q., & Hofmann, M. A. The next generation library catalog: A comparative
study of the OPACs of Koha, Evergreen, and Voyager. Information Technology
and Libraries, 29.3 (2010), pp. 141-150.
UNIT 4 LIBRARY AUTOMATION:
APPLICATIONS OF OPEN
SOURCE SOFTWARE
Structure
4.0 Objectives
4.1 Introduction
4.2 Open Source Movement
4.2.1 Open Source Software
4.2.2 Open Source Software: Development Path
4.2.3 Open Source Software vs. Commercial Software
4.3 Open Source Software: Philosophy, Principles and Licensing
4.3.1 Philosophy of Open Source Software
4.3.2 Principles of Open Source Software
4.3.3 Licensing of Open Source Software
4.3.4 Open Source and Open Standards
4.4 Open Source Software and Libraries
4.4.1 Use of Open Source Software
4.4.2 Prospects and Problems
4.4.3 Use of Open Standards
4.5 Open Source Software in Libraries: System Level
4.5.1 Open Source Operating System
4.5.2 LAMP Architecture
4.5.3 LAMP Components
4.6 Open Source Software in Libraries: Domain Level
4.6.1 Automated Library System
4.6.2 Digital Library System
4.6.3 Cataloguing Tools
4.6.4 Other Library Activity Tools
4.7 Towards Open Library System
4.8 Summary
4.9 Answers to Self Check Exercises
4.10 Keywords
4.11 References and Further Reading
4.0 OBJECTIVES
After going through this Unit, you will be able to:
• know what the open source movement is and how it is improving computing
infrastructure;
• understand differences between commercial and open source software;
• identify advantages of using open source software and open standards in
library system; and
• understand the emerging concept of open library system.
4.1 INTRODUCTION
Present library services are software-centric. By availability and
distribution policy, software products fall into two groups: closed source
commercial products and open source, free-to-use products. Commercial software
in the domain of library activities is available against huge license fees along
with separate annual maintenance contracts, updating fees and many other hidden
costs. As a result, adoption of a commercial LMS in a library (for example) is
not a one-time capital expenditure; it leads to considerable recurring
expenditure on an already strained library budget. Moreover, these commercial
LMSs are basically available in a generic, one-size-fits-all model and provide
no scope for customisation to suit the needs of a particular library
(Mukhopadhyay, 2008). This is an alarming situation for libraries in India.
Libraries are paying huge sums of money to procure commercial LMSs but are
unfortunately not in a position to even change the colour of the user interface.
Another serious lacuna is the non-transparent nature of such software in its
use of global de jure or de facto standards.
Application of open source software in different library activities may be a
viable alternative that avoids the problems associated with
commercial software. The tradition of open source software began with the
advent of ARPANET (the forerunner of the Internet) in 1969 and was boosted by
the development of open source operating systems like GNU/Linux. Naturally, a
question comes to mind: what is open source software and how is it different?
According to OSI (Open Source Initiative, 2003) – “Open source promotes
software reliability and quality by supporting independent peer review and rapid
evolution of source code. To be certified as open source, the license of a program
must guarantee the right to read, redistribute, modify, and use it freely”. Open
source software is available freely to end users. Here the term Free has a dual
meaning: users are given the freedom to customise the source code, and the
software is available free of cost. Open source software carries four freedoms:
read (source code is available for verification), use (binary code is available
for application), modify (source code is available for modification and
customisation) and redistribute (source code in original or in modified form is
available for redistribution).
In the area of library services, the greatest benefit of open source software is the
opportunity for library professionals to work at the system level and to participate
in the software development process as co-developers. Fortunately, the domain of
library and information science has, right from the beginning of the open source
movement, benefited from structured effort and software philanthropy. We
have mature ILSs like Koha (comparable to any global ILS) from HLT, New
Zealand, comprehensive digital library software like DSpace from MIT, US
(with support from HP), and Greenstone Digital Library Software (or GSDL) from
the University of Waikato (presently supported by UNESCO). Apart from these very
popular open source software, the arena is presently filled with an array of
promising software like MarcEdit and ISISMARC (MARC cataloguing tools),
WEBLIS (ILS based on CDS/ISIS), the YAZ toolkit (Z39.50 client and server),
Lucene and Solr (text retrieval engines), Unicode-compliant multilingual tools
etc. Most of these open source software in the domain of LIS are very transparent
in the use of standards and generally deploy open standards for achieving
interoperability.
This brief introduction gives you an idea of open source software and the
possibilities for applying open source software to enhance library systems and
services. Now we are all set to discuss open source software in depth. The
discussion mainly covers six areas: 1) history, development, features and
advantages of open source; 2) philosophy, principles and IPR issues related to
open source; 3) use and advantages of open source software in libraries in
general; 4) application of open source software in library activities at the
system level; 5) application of open source software in library activities at
the domain level; and 6) the emerging concept of open library systems that
manage open contents and are supported by open standards and open source
software.
4.2.1 Open Source Software
Open Source Software (OSS) is not a new idea. You already know that the open
source movement started with the Internet. Recently, technical and market forces
have combined to carve out a niche for the open source movement. The open source
movement has the potential to define the computing infrastructure of the next
century (Marco & Lister, 1987). Open source is a software development model
as well as a software distribution model. OSS development follows the style of
Linus Torvalds (the developer of the Linux operating system, an open source
system software): release early and often, delegate
everything and be open to the point of promiscuity. Raymond (2001a; 2001b)
termed this the bazaar style of development, in
contrast with the traditional software development process (termed by Raymond
the cathedral model), in which software is carefully crafted by individual
wizards or small groups of experts working in splendid isolation. The Open
Source Initiative (2004), a forum to promote the open source software movement
as a viable alternative to commercial software, claims –
Definition
The open source movement has been in conscious development for nearly two
decades but the term “open source” itself has been a relative latecomer. Christine
Peterson of the Foresight Institute proposed the term open source in late 1997
during a meeting of a small group of key persons of the open source movement
(Raymond, 2001c). This group registered the domain name opensource.org, defined
“open source,” formed the Open Source Initiative (OSI), designed OSI
certification, and created a list of licenses that meet the standards for open source
certification. In the open source software development model the source code of
software is made freely available along with the binary version so that anyone
can see, change, and distribute it, provided that he or she abides by the
accompanying license. According to OSI (Open Source Initiative, 2003a) –
You can easily understand from table 4.1 that the fundamental difference is the
opportunity for customisation. Open source also provides freedom to redistribute
the customised version of the software.
• Free redistribution: The license must allow end users to redistribute the
software, even as part of a larger software package and may not charge
royalties for this right.
• Source code: The distribution must make the source code freely available
to developers.
• Derived works: The license must allow modifications and derived works
and must allow them to be distributed under the same terms as the license
of the original software.
• Integrity of the author’s source code: The license may require that modified
distributions be renamed, or that modifications be made via patch files rather
than modifying the source code.
• No discrimination against persons or groups: The license must not
discriminate against any person or group of persons.
• No discrimination against fields of endeavour: The license must not restrict
anyone from making use of the program in a specific field of endeavour.
• Distribution of license: The rights attached to the program must apply to
all to whom the program is redistributed without the need for execution of
an additional license by those parties.
• License must not be specific to a product: A program may be extracted
from a larger distribution and used under the same license.
• The license must not restrict other software: The license must not
contaminate other software by placing restrictions on any software distributed
along with the licensed software.
• The license must be technology-neutral: The license should not be framed
on the basis of any individual technology or style of interface.
BSD-style Licenses
BSD-style (Berkeley Software Distribution) licenses are modelled on the original
license issued by the University of California, Berkeley. These are among the
most permissive licenses and include key features like: i) attribution is given to
the original license holder by including the original copyright notice in source
code files; ii) no attempt is made to sue or hold the original licensor liable for
damages; iii) software code available under a BSD-style license can easily be
incorporated into commercial applications; and iv) BSD-style licenses do not
require the distribution of source code (after modification of original code). These
two major licenses may be compared against the following features in the context
of distributing open source software:
Feature                                                  GPL        BSD
                                                         Licensed   Licensed
Must distribute original source code                     Yes        No
Must distribute user-created source code                 Yes        No
User-created source code must be available under GPL     Yes        No
Proprietary software linking possible                    No         Yes
Compatible with GNU GPL                                  Yes        No*
*The original BSD license is not GPL compatible but the modified BSD license is
compatible with GPL.
The W3C (2006) provides a set of six criteria for defining Open Standards:
• transparency (due process is public, and all technical discussions, meeting
minutes, are archived and citable in decision making);
• relevance (new standardisation is started upon due analysis of the market
needs, including requirements phase, e.g. accessibility, multilingualism);
• openness (anybody can participate, and everybody does: industry, individual,
public, government bodies, academia, on a worldwide scale);
• impartiality and consensus (guaranteed fairness by the process and the neutral
hosting of the W3C organisation, with equal weight for each participant);
• availability (free access to the standard text, both during development and
at final stage, translations, and clear IPR rules for implementation, allowing
open source development in the case of Web technologies); and
• maintenance (ongoing process for testing, errata, revision, permanent access).
Software development, as a process, depends on standards (de jure/de facto or
proprietary/open) at each step. Open standards provide the following advantages:
1) free to apply for any lawful purpose; 2) open and collaborative process of
development; 3) well documented, with no chance of data loss due to technical
obsolescence. The visible disadvantages of open standards are: 1) availability
of only a few major players (e.g. LoC, IFLA etc.); 2) lack of coordination between
open standard initiatives and open source software developers; and 3) non-availability
of open standards in many important facets of library activities (e.g.
exchange of bibliographic and authority data). Some of the well known open
standards in use in different library related software are the MARC 21
family of standards for resource description, MARCXML as exchange format,
OAI-PMH as metadata harvesting standard, and SRU/SRW as standards for web
based distributed searching.
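To make the idea of a metadata harvesting standard concrete, the sketch below builds an OAI-PMH ListRecords request URL and reads record titles from a response. It is a minimal illustration under stated assumptions: the base URL and the sample XML are invented for the example, and the response is parsed from an embedded string so that no network access is needed. The verb, the metadataPrefix parameter and the XML namespaces are those defined by OAI-PMH 2.0 and Dublin Core.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# XML namespaces used by OAI-PMH 2.0 responses carrying Dublin Core records.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def build_list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build a ListRecords request URL for a repository's OAI-PMH endpoint."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)


# Hypothetical, heavily shortened response used only for this illustration.
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Library automation with open source software</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
    <record>
      <header><identifier>oai:example.org:2</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Open standards in libraries</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>
"""


def extract_titles(xml_text):
    """Return (identifier, title) pairs for every record in the response."""
    root = ET.fromstring(xml_text.strip())
    pairs = []
    for record in root.iterfind(".//oai:record", NS):
        identifier = record.findtext("oai:header/oai:identifier", namespaces=NS)
        title = record.findtext(".//dc:title", namespaces=NS)
        pairs.append((identifier, title))
    return pairs


if __name__ == "__main__":
    # "http://example.org/oai" is a placeholder endpoint, not a real repository.
    print(build_list_records_url("http://example.org/oai"))
    for identifier, title in extract_titles(SAMPLE_RESPONSE):
        print(identifier, "->", title)
```

A real harvester would fetch the URL over HTTP and follow resumption tokens for large result sets; both are omitted here to keep the sketch short.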
The Digital Library Federation (2004) of the USA, in its draft report, advocates the use of OSS
in libraries for the following reasons:
• OSS is an economical alternative to libraries’ reliance upon commercially
supplied software. It means that the real costs involved in the development,
maintenance, and use of OSS are lower than those associated with
commercial software (license, upgrading and maintenance fees);
• With OSS, the IT infrastructure for library operations and services can be:
– Open, that is, built according to open standards and as such potentially
interoperable with other software and systems;
– Ubiquitously available to libraries and can be tailored to suit the needs
and circumstances of individual libraries;
– Documented (and documentation is accessible to all); and
– Modified and corrected more effectively (“many eyeballs make bugs
shallow”).
The above factors and advantages as identified by experts are responsible for
increasing use of open source software in different libraries. Open source is a
boon for libraries in developing countries like India. Small libraries that
cannot afford a costly ILS can now opt for library automation using
open source software.
(* ABCD and WEBLIS are based on CDS/ISIS, which is a closed-source textual DBMS developed
by UNESCO and available free of cost)
Most of the LMSs listed above are in their infancy. The mature LMS block
includes Koha, Emilda, Evergreen, NewGenLib, WEBLIS and PHPMyLibrary.
Koha, the first open source library management software, has created a high
level of interest in the library profession for the open source movement internationally.
Koha (in Maori language Koha means an unconditional gift) is a full-featured
open-source ILS. Developed initially in New Zealand by Katipo Communications
Ltd and first deployed in January of 2000 for Horowhenua Library Trust, Koha
is currently maintained by a team of software developers and library technology
staff from around the globe.
• YAZ Toolkit: The YAZ toolkit implements the Z39.50 standard and protocol for both
the origin and the target. (FLOSS based Dependencies: None); URL:
http://www.indexdata.dk/yaz/
4.8 SUMMARY
This Unit covered the what and why of open source software in general. It also
discussed the history of the open source movement, including the philosophy, principles and
licensing of open source software. Most library experts are of the opinion that
open source software has the potential to change the way libraries deal with
software. The library automation process is greatly influenced by the application
of open source software and open standards. OSS can provide a viable alternative
to commercial ILSs. This Unit examined the use of open source software in
libraries at two different levels: system level and task level. At the system level,
LAMP architecture prevails in many libraries. At the task level libraries are
fortunate to have open source ILSs, open source digital library software, open
source cataloguing tools and many more. This Unit also discussed problems of
open source software in general and issues related to the use of open standards
in developing OSS. Finally, it predicted the emergence of open library systems
with three interrelated components: open source, open standards and open
content.
4.9 ANSWERS TO SELF CHECK EXERCISES
4) The culture of open source software started with the Internet in 1969 in the
name of shareware or free software. The movement gained momentum with
the establishment of the Free Software Foundation (FSF) by Richard Stallman
in 1985. But the term open source itself has been a relative latecomer.
Christine Peterson of the Foresight Institute proposed the term open source
in late 1997. Open source software is fundamentally different from
shareware, public-domain software and freeware, which are made freely available
without access to source code.
6) As per the distribution policy, the whole array of software may be categorised
into three groups: commercial software, freeware, and open source
software. In the case of commercial software, only binary code (or executable
code) is available, against fees. Freeware is available at no cost,
also as binary code only. In both of these cases source code is not available with
the software and therefore customisation is not possible. Open
source software, in contrast, includes both source code and binary code at no cost. It
supports modification of source code and distribution of source code under a
license.
7) The Open Source Initiative (OSI) set out ten criteria in 2006 for a software
product to be called open source software. These ten criteria are popularly
known as the Ten Commandments of open source. These are: 1) Free
redistribution of software; 2) Availability of source code; 3) Derived works
also available as open source; 4) Integrity of the author’s source code; 5)
No discrimination against persons or groups; 6) No discrimination against
fields of endeavour; 7) Distribution of license; 8) License must not be specific
to a product; 9) The license must not restrict other software; and 10) The
license must be technology-neutral.
9) Open source software is available with attached licenses. The licenses
provide freedom to study, customise and redistribute open source software.
Licensing issues related to open source software are complex in nature.
Open source software is released under a variety of different licenses. Studies
show that there are more than 60 licenses. These licenses are grouped under
eight categories by OSI. However, an in-depth analysis shows that there are
only two primary types of licenses and countless variants are based on these
two widely adopted licenses. These two main licenses are the GNU (recursive
acronym for GNU’s Not Unix) General Public License (GPL) and the BSD-style
licenses.
11) The major advantages of open source software are – 1) freedom to incorporate
changes as required by an individual library; 2) no vendor lock-in and
freedom to hire technical expertise from outside; and 3) better software
development model (continuous upgrading, scope to contribute as co-
developer and global professional fraternity). The disadvantages associated
with open source applications are – 1) steep learning curve; 2) non-
availability of in-house technical expertise; 3) no on-call and on-site technical
support.
14) Libraries all over the world are passing through a rapid phase of development.
Sometimes technologies demand fundamental changes in library operations
and services. Moreover, libraries are now operating in a distributed global
networked environment. It is no longer possible for a library to serve in stand-alone
mode. On the other hand, the volume and variety of user demands
are increasing day by day. As a result, libraries’ reliance upon open standards
and open source software is also increasing to satisfy the growing
multidimensional needs of users and systems, because open source software
adapts to new technologies and architectures more rapidly than
commercial software. Moreover, the age-old software development model
followed by most of the commercial ILSs is not adequate for modern library
activities. Consequently, library automation and digitisation programmes are
increasingly using open source software for different library activities.
15) There are many open source software packages for different library activities. A further
advantage of the open source domain is that one particular area of activity
often has many open source packages. For example, the domain of library
automation includes a total of 14 open source software packages. Koha is a web-enabled
open source ILS based on LAMP architecture, meant for library automation
activities. Evergreen is a client-server architecture based open source ILS
meant for automation of a group of libraries and useful for developing union
catalogues in a library network setup. Another major open source ILS is
NewGenLib, developed in India. It uses open source companion software
such as PostgreSQL as RDBMS, Apache Tomcat as Java servlet engine and
Java SDK as programming environment.
16) In the open source domain, as with open source ILSs, there are many open
source digital media archiving software packages. This domain of open source digital
library software can be categorised into two basic groups: 1) centralised
processing with distributed access architecture; and 2) distributed processing with
distributed access architecture. In the first group, the most comprehensive
one is Greenstone Digital Library Software, and DSpace is the most popular
software in the second group. Greenstone is written in the Perl programming
language and supports archiving of many digital formats. DSpace uses
PostgreSQL RDBMS, Apache Tomcat and Java SDK.
4.10 KEYWORDS
API : Application Programming Interface. A language and
message format used by an application program to
communicate with the operating system or some other
control program such as a database management
system (DBMS).
Discovery application : A computer application designed to simplify, assist
and expedite the process of finding information
resources.
DNS : Domain Name System, a service that resolves
symbolic host names into numeric IP addresses, and
vice versa.
Encoding : A character encoding scheme is a set of rules for
representing a sequence of character codes as a byte
sequence.
ERMS : Electronic Resources Management System is used to
manage a library’s electronic resources, primarily e-
journals and databases. Systems can include features
to track trials, license terms and conditions, usage,
cost, and access.
FOSS : Free/Open Source Software.
GNOME : GNU Network Object Model Environment, a
desktop environment based on GTK+ toolkit and
other desktop components.
GNU : A recursive acronym standing for “GNU’s Not Unix”; a system
based on Unix architecture.
I18N : Abbreviation for Internationalisation.
IIIMF : Internet/Intranet Input Method Framework, a new
framework for cross-platform input method
developed by OpenI18N.org. IIIMF bridges different
IM protocols by using wrappers that communicate
with a common protocol.
Interoperability : The ability for two different computer systems to
communicate and exchange information in a useful
and meaningful manner.
Kernel : A very low-level software that manages computer
hardware, multi-tasks the many programs that are
running at any given time, and other such essential
things.
L10N : Abbreviation for Localisation.
Localisation : Implementation of cultural conventions defined by
the internationalisation process according to different
languages and cultures.
Metadata harvesting : A technique for extraction of metadata from
individual repositories for collection into a central
catalog.
Multilingual : Supporting more than one language simultaneously.
Often implies the ability to handle more than one
script and character set.
Open Source : A concept through which programming code is made
available through a license that supports the users
freely copying the code, making changes to it, and
sharing the results. Changes are typically submitted
to a group managing the open source product for
possible incorporation into the official version.
Development and support are handled cooperatively
by a group of distributed programmers, usually on a
volunteer basis.
OpenSearch : A collection of technologies developed by Amazon
that allow publishing of search results in a format
suitable for syndication and aggregation.
OpenURL : A URL with stored metadata that is user context
sensitive in what information or hypertext link is
delivered.
Pango : A Unicode-based multi-lingual text rendering engine
used by GTK+ 2. Like GTK+, Pango is written in C
and licensed under LGPL.
PHP : A server-side scripting language for creating dynamic
web pages.
POSIX : Portable Operating System Interface Specification is
the minimum specification of system calls for
operating systems based on Unix, defined by IEEE
so that applications based on it are guaranteed to be
portable across OSs. Although based on Unix, POSIX
is also supported by some non-Unix OSs.
Protocol : A standard procedure for the message formats and
rules that two computer systems must follow to
communicate with each other.
RSS : Really Simple Syndication is an XML format used
for distribution or syndication of frequently updated
Web contents.
Script : A system of characters used to write one or several
languages.
SSH : Secure Shell is used for remote login using an encrypted
connection to prevent sniffing by third parties.
System Analysis : A powerful technique for the analysis of an
organisation and its work.
UCS : Universal Multi-octet coded character set, as defined
by ISO/IEC 10646 to represent the world’s writing
systems. It is maintained by ISO/IEC JTC1/SC2/
WG2, with contributions from the Unicode Consortium.
Unicode : A universal character-encoding standard used for
representation of text for computer processing.
Unicode provides a unique numeric code (a code
point) for every character, no matter what the
platform, no matter what the program, no matter what
the language. The standard was first published by the
Unicode Consortium in 1991.
UTF-8 : Unicode (UCS) Transformation Format, using an 8-bit
multibyte encoding scheme.
X Window : A graphical environment initially developed by the
Athena project at MIT with support from some
vendors, and later maintained by the X consortium.
X Window is the major graphical environment for
most Unix variants nowadays.
XML : EXtensible Markup Language is an open standard from
the World Wide Web Consortium for describing data.
It is used for defining data elements on
a Web page, business-to-business documents, and
other hierarchically structured text and data.
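The keywords Encoding, Unicode and UTF-8 above can be illustrated with a short Python sketch. It lists, for each character of a string, its Unicode code point and the bytes that UTF-8 uses to store it. The function name and the sample characters are chosen only for illustration.

```python
def describe(text):
    """Return (character, code point, UTF-8 bytes in hex) for each character."""
    rows = []
    for ch in text:
        code_point = f"U+{ord(ch):04X}"          # the unique numeric code
        utf8_bytes = ch.encode("utf-8").hex(" ")  # the byte sequence on disk
        rows.append((ch, code_point, utf8_bytes))
    return rows


if __name__ == "__main__":
    # Latin capital A needs one byte; Devanagari letter KA needs three.
    for ch, code_point, utf8_bytes in describe("A\u0915"):
        print(ch, code_point, utf8_bytes)
```

The same code point is therefore stored as one byte for a basic Latin letter and as several bytes for most Indian scripts, which is why catalogue records must declare their encoding when exchanged between systems.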
4.11 REFERENCES AND FURTHER READING
A Brief History of Free/Open Source Software Movement.
<http://www.openknowledge.org/writing/open-source/scb/brief-opensource-history.html>
Digital Library Federation. The future is open: Digital libraries through open
source software (2004). <http://www.dlf.org/Dlinitiatives/archiv/open.htm>
DeMarco, T., and Lister, T. Peopleware: Productive projects and teams. New York:
Dorset House Publishing, 1987. Print
Mukhopadhyay, P. Five laws and ten commandments: the open road of library
automation in India. Proceedings of the National Seminar on Open Source
Movement – Asian Perspective, XXII, Roorkee, 2006. Kolkata: IASLIC, 2006.
pp. 27-36. Print
Open Source Initiative. Open source software certification process (2003).
<http://www.opensource.org/osslicense.htm>
Open Source Initiative. OSI certified license: The ten basic criteria (2003).
<http://www.opensource.org/tencom.htm>
Open Source Initiative. Open source software and future of computing (2004).
<http://www.opensource.org/future.htm>
Raymond, E. S. Homesteading the noosphere (2001).
<http://tuxedo.org/~esr/writings/cathedral-bazaar/hacker-history.htm>
Raymond, E. S. The cathedral and the bazaar: Musings on Linux and open
source by an accidental revolutionary (Rev. ed). Cambridge: O’Reilly and
Associates, 2001. Print
BLOCK 2 DIGITISATION AND DIGITAL LIBRARIES – DSPACE AND GSDL
Introduction
Library automation over the past few decades has mainly focused on the
creation of surrogate records of printed documents available in a library, or on providing
services through secondary databases held locally on CD-ROM or magnetic tapes.
The scope and functions of integrated library packages, till recently, were essentially
restricted to providing access to documents at bibliographic level. The new versions
of integrated library packages, however, tend to provide additional features and
functionalities akin to digital libraries. Since the automated systems till recently
provided only bibliographic information, users had to depend heavily on the physical
collection available either in their institutional library or on inter-library loan from
other libraries for references retrieved from the secondary services.
Digitisation is the process of converting the content of the physical media (text, audio,
video) into digital media. For printed material an image of the physical object is
captured using a scanner or digital camera and converted into a digital format that
can be stored electronically and accessed via computer or mobile devices. For audio
and video material encoders are used for digitisation.
Once document and media content are digitised, they need to be archived and
made accessible to the users. For this, tools for organising digital collections are needed.
DSpace and Greenstone Digital Library Software are two major applications
used by libraries the world over to organise digital collections and build digital libraries.
This block has four Units. Unit 5 on Introduction to Digital Library provides an
overview of the concept of digital library and major worldwide initiatives. Unit 6
discusses the Digitisation Process. Units 7 and 8 deal with Creating Digital Libraries
Using DSpace and GSDL respectively.
UNIT 5 INTRODUCTION TO DIGITAL LIBRARY
Structure
5.0 Objectives
5.1 Introduction
5.2 Concept
5.3 Types of Digital Libraries
5.4 Major Digital Library Initiatives
5.5 Future Trends
5.6 Summary
5.7 Answers to Self Check Exercises
5.8 Keywords
5.9 References and Further Reading
5.0 OBJECTIVES
After going through this Unit, you will be able to:
• understand the basic concept, and need for digital libraries;
• explain different types of digitisation; and
• discuss future trends of digital libraries.
5.1 INTRODUCTION
The digital age has brought a tremendous change in the way information is stored and
accessed. It is marked by three distinct features: abundance, currency and easy
access of information. This has brought about a change in the concept of libraries,
their collection and services. Many new terms, viz. ‘digital libraries’, ‘libraries without
walls’ and ‘virtual libraries’, are emerging to describe the libraries of the present day.
The term ‘digital library’ is a shift from the earlier term ‘electronic library’, which was
used for the last two decades to describe the book-less library which relies on
telecommunication and computers to provide users with whatever information they
need. A digital library is popularly viewed as an electronic version of a library where
storage is in digital form, allowing direct communication to obtain material and copying
it from a master version. It combines technology and information resources to allow
remote access, breaking down the physical barrier between resources. In Wilensky’s
view “the digital library will be a collection of distributed information services, producers
will make it available, and consumers will find it through the automated agents”. In
this model it appears that the traditional libraries will have no role to play. How far
this will be true only time can tell.
In the early stages of development of digital libraries the main focus was on providing
dial-up access to Online Public Access Catalogues (OPAC). The term, however,
evokes different meanings for different people. To some it may simply mean
computerisation of the traditional library system. To those with a library science
background it means doing things in a new way: using new types of information
resources, new approaches to acquisition, new methods of storage and preservation,
new approaches to classification and cataloguing, and new ways of interacting with
patrons, with more reliance on electronic systems and networks. As it stands today,
most libraries in the developed countries have their own homepages providing links
to local information and electronic databases, bibliographic as well as full text, apart
from their own online systems of collection and services.
Digital libraries in future will not be a standalone version. The explosive growth in
networked connectivity and rapid advances in computing power are replacing the
older notions of standalone information utilities with newer notions of integrated digital
libraries. The integrated digital library creates a shared environment linking everything
from personal collection, collection of conventional libraries and large databases
spread all over the world.
In recent years the term ‘virtual library’ has become more popular. It is
used to describe libraries that provide access to digital information using a variety of
networks, specifically the internet and the World Wide Web, irrespective of place
and time. According to Gilbert “it is an aggregate of libraries or literature bases, the
catalogue or bibliographies of which are accessible electronically (e.g. with a personal
computer) and of which some may offer document ordering and delivery services.
The center of the virtual library is by definition the individual user, or his/her work
station”. Thus in the present day context virtual library is the convergence of a number
of concepts: electronic browsers, online catalogues and literature bases, and
empowerment of the end users.
In Toren and Czech’s view, libraries in future will become icons on the screen and
library buildings will function as book warehouses. The future implication of such a
situation needs to be contemplated seriously.
5.2 CONCEPT
Defining Digital Libraries
The term “digital library” is the most recent in a long series of names for a concept
that has been written about for nearly as long as computers have existed:
a computerised “library” that would supplement, add functionality to, and even replace
traditional libraries.
Digital libraries necessarily include a strong focus on the management of digital content,
just as traditional libraries have focused for long on the management of content in
physical forms. Most of the digital content that is being managed includes human
language, either in the form of character-coded electronic text, scanned versions of
printed or handwritten text, or digital representations of human speech. Language
technology therefore plays a major role in managing digital content. This comes as no
surprise, of course. Digital libraries today make good use of what we know about
searching large collections, and techniques such as machine-assisted indexing are
employed increasingly often as we strive to extend our reach to progressively larger
collections. But we are on the verge of a new era, one in which our machines will
learn from what we do and then apply those capabilities to enable the management
of digital content at a far larger scale than we could ever hope to do ourselves.
A few advantages of digital libraries, according to Haddouti, are:
• User can access the information anywhere
• Reduces bureaucracy by providing access to the information
• The information is not necessarily located in same place
• Understanding the catalogue structure is not necessary
• Cross references to other documents speed up the work of users
• Full text search
• Protected information source
• Wide exploration and exploitation of the information
Knowledge dissemination is an integral part of the success and popularity of
digital libraries. The aim is to provide universal access to human knowledge,
and given the advancement of digital storage and communications this goal is now
achievable.
Distributed Models
Libraries are increasingly adopting distributed models for information access and
management, and more often use open and collaborative models for developing
library content and services. With the incorporation of open models and distributed
technologies, libraries have the potential to become more involved in knowledge
creation, dissemination, and use, in ways that represent the library’s contributions
more broadly and that intertwine the library with the other stakeholders in these
activities. The library becomes a collaborator within the academy, yet retains its
distinct identity.
In this second phase in the evolution of library roles, the library starts to engage in
collaboration as a strategy to address its core mission of building collections,
maintaining access, and providing service. As responsibilities for content and services
become more distributed, models of central control give way to new mechanisms for
coordination and collaboration. Ultimately, the processes of scholarly communication
become as critical as traditional publication products.
Digital libraries can be grouped in different ways. They can be classified by origin,
such as digital libraries developed in the USA as part of DLI 1 and DLI 2 (the Digital
Library Initiatives), digital libraries developed in the course of the eLib (Electronic
Libraries) programme in the UK, digital libraries built by individual institutions, digital
libraries that are part of national libraries, digital libraries that are part of universities;
or by period, by country of origin, and so on.
• early digital libraries, e.g. ELINOR, Gutenberg
• digital libraries of institutional publications, e.g. ACM, IEL
• digital library developments at national libraries, e.g. the British Library, Library
of Congress (THOMAS), Digital Library of Canada
• digital libraries at universities, e.g. Berkeley Digital Library SunSITE, Bodleian
Library Digital Library Projects, California Digital Library, DIGILIB, iGEMS
and SETIS
• digital libraries of special materials, e.g. Alexandria, Informedia, Grainger
Engineering Library
• digital libraries as research projects, e.g. GDL, NCSTRL, NDLTD
• digital libraries as hybrid library projects, e.g., HeadLine.
• The IEEE Electronic Library
(http://www.ieee.org/portal/innovate/products/research/ieee_iel.html)
The IEEE digital library is the gateway to valuable, cutting-edge research,
standards and educational courses with more than two million articles. It offers
100% full-text searchable content with full-page PDF images of all IEEE articles,
papers and standards.
• The New Zealand Digital Library Project (http://nzdl.sadl.uleth.ca/cgi-bin/library.cgi)
The New Zealand Digital Library Project is a research programme at the
University of Waikato. The main objective of this project is to develop the
underlying technology for digital libraries and make it available publicly.
• Perseus Digital Library (http://www.perseus.tufts.edu/hopper/)
Perseus is an evolving digital library that aims to bring a wide range of source
materials to as large an audience as possible.
• The Berkeley Digital Library (http://sunsite.berkeley.edu/)
The Berkeley Digital Library project began as an inter-agency, academic teaming
to research collaboration techniques. It continues and is currently developing
the tools and technologies to support highly improved models of the “scholarly
information life cycle”. The goal is to facilitate the move from the current
centralized, discrete publishing model to a distributed, continuous, and self-publishing
model. It provides access to a large variety of scholarly publications.
• The Networked Digital Library of Theses and Dissertations (NDLTD)
(http://www.ndltd.org/)
The Networked Digital Library of Theses and Dissertations is an international
organisation dedicated to promoting the adoption, creation, use, dissemination
and preservation of electronic analogues to the traditional paper-based theses
and dissertations. Its website contains information about the initiative, how to set up
Electronic Thesis and Dissertation (ETD) programmes, how to create and locate
ETDs, and current research in digital libraries related to NDLTD and ETDs.
• The University of Adelaide Digital Library (http://digital.library.adelaide.edu.au/)
The Digital Library undertakes projects aimed at enhancing online access to
information for its members. It provides access to exam papers available
online, the Australian digital theses collection and e-books available at Adelaide.
• The Cuneiform Digital Library Initiative (CDLI) (http://cdli.ucla.edu/)
The Cuneiform Digital Library Initiative represents the efforts of an international
group of Assyriologists, museum curators and historians of science to make
available through the internet the form and content of cuneiform tablets dating
from the beginning of writing until the end of the pre-Christian era.
• UQ eSpace (http://espace.library.uq.edu.au/)
UQ eSpace is the University of Queensland’s institutional digital repository for
publications, research, and teaching materials. Deposited material covers a very
wide range of subjects and disciplines. It also holds the electronic full text of
many peer-reviewed published articles and conference papers, book chapters,
theses and other forms of written research from UQ academic staff and students.
• Traditional Knowledge Digital Library
(http://www.tkdl.res.in/tkdl/langdefault/common/home.asp?GL=Eng)
The Traditional Knowledge Digital Library is a well known Indian digital library
initiative being implemented by the National Institute of Science Communication
and Information Resources (NISCAIR). The major objective is to provide
information on the Indian systems of medicine such as Ayurveda, Unani, Siddha,
Yoga, Naturopathy and Tribal Medicine.
• The Archives of Indian Labour (http://www.indialabourarchives.org/)
The Archives of Indian Labour is a collaborative project of V.V. Giri National
Labour Institute and the Association of Indian Labour Historians. The main
objective is to preserve and make accessible archival documents on the working
class of India.
There are numerous areas of research related to the historic interests of the digital
library community that are at the crossroads of technology and social science and
which will demand investment and attention in the coming years; many of these are
natural extensions and elaborations of the collaborations initiated by the past decade
of digital library research programs. Some of the areas likely to drive the future
of digitisation are listed below.
• Personal information management. As more and more of the activities in our
lives are captured, represented and stored in digital form, the questions of how
we organize, manage, share, and preserve these digital representations will
become increasingly crucial. Among the trends lending urgency to this research
area are the development of digital medical records (in the broadest sense), e-
portfolios in the education environment, the overall shift of communications to
email, and the amassing of very large personal collections of digital content
(text, images, video, sound recordings, etc.)
• Long term relationships between humans and information collections and systems.
This is related to personal information management, but also considers
evolutionary characteristics of behaviour, systems that learn, personalization,
system to system migration across generations of technologies, and similar
questions. This is connected to human-computer interface studies and also to
studies of how individuals and groups seek, discover, use and share information,
but goes beyond the typical concerns of both to take a very long time horizon
perspective.
• Active environments for computer supported collaborative work offer the starting
point for another research program. These environments are called for, under
the term “collaboratories”, by the various cyberinfrastructure and e-science
programs, but have much more general applicability for collaboration and social
interactions. From one perspective, these environments are natural extensions
of digital library environments, but at least some sectors of the digital library
community have always found active work environments to be an uncomfortable
fit with the rather passive tradition of libraries; perhaps here the baggage of
“digital libraries” as the disciplinary frame is less than helpful. But there is a rich
research agenda that connects literatures and evidence with authoring, analysis
and re-use in a much more comprehensive way than we have done to date; this
would consider, for example, the interactions between the practices of scholarly
authoring and communication on one hand, and on the other, the shifting practices
of scholarship that are being recognized and accelerated by investments in e-
science and e-research.
5.6 SUMMARY
Libraries have always played a significant role in society, and digital libraries with the
promise of breaking the barriers of geographical distance, language and culture, have
a potentially even more significant social role. Digital libraries will not only change
our reading and information use habits, they are also going to bring major changes in
the economic models of information generation, distribution and management functions.
A tremendous amount of research and development activity has gone into the study
of digital libraries. Many issues have been addressed and problems have been partly
or fully resolved. Researchers from a variety of disciplines, such as library and
information science, computer science and engineering, social sciences and humanities
are working closely together to look into the myriad of unresolved issues.
To exploit the benefits of digital libraries in Indian languages, there is an urgent need
for tools and applications such as OCR and machine translation systems, so that
users can read rare classics published in any language and researchers
can use these tools for linguistic research. The development of a parallel aligned corpus
is a first attempt of this kind in the context of Indian languages. It is the start of
several efforts that will enhance research in the field of
computational linguistics. Used as a Translation Memory (TM), the parallel corpus will
be a valuable resource for improving translation systems and translators’ efficiency.
It will also boost the development of lexical and terminology databases by
combining quantitative and qualitative analysis of text. A text analyser is a new
kind of tool that is helpful in lexicography, knowledge acquisition, and language and
writing variation studies. The creation of digital libraries has been a good test bed for
OCR, and now that the world is moving towards speech-to-speech translation, all
these tools together will help in building such a system for Indian languages.
5.8 KEYWORDS
Hybrid library : Libraries containing a mix of traditional
print library resources and the growing number of
electronic resources.
OCR : Optical Character Recognition, or OCR, is a
technology that enables you to convert different
types of documents, such as scanned paper
documents, PDF files or images captured by a
digital camera, into editable and searchable data.
Open Knowledge Initiative : The Open Knowledge Initiative (O.K.I.) is an open
and extensible architecture for learning technology
specifically targeted to the needs of the higher
education community.
Open Source Movement : A broad-reaching movement of individuals who
support the use of open source licences for some
or all software. Open source software is made
available for anybody to use or modify, as
its source code is made available.
UNIT 6 DIGITISATION PROCESS
Structure
6.0 Objectives
6.1 Introduction
6.2 Digitisation of Print Based Documents
6.2.1 Capturing Print Based Document
6.2.2 Digitising
6.3 Video Digitisation
6.3.1 Video Capturing
6.3.2 Video Digitisation Process
6.4 Audio Digitisation
6.4.1 Audio Capturing
6.5 Audio/Video Compression
6.6 Audio/Video Streaming
6.7 File Formats and Content Creation
6.8 Summary
6.9 Answers to Self Check Exercises
6.10 Keywords
6.11 References and Further Reading
6.0 OBJECTIVES
After going through this Unit, you will be able to:
• Understand the digitisation process of text, audio and video;
• Know different types of file formats; and
• Explain the file compression process.
6.1 INTRODUCTION
A digital library may contain materials that are born digital, such as e-journals and e-
books, or may contain materials that were originally produced in another form but
subsequently digitised. The process of digitising materials involves different steps
depending upon material, technology and requirement. Various technical issues, like
hardware and software, file formats and file compression and then the post processing
requirements for making the digitised file accessible to end-user will be discussed.
Scanning technology has improved considerably over the years in terms of speed
and resolution. There are several types of scanning devices available in the market
now. Scanners come in three broad price ranges: i) low cost flatbed scanners or
hand held devices, ii) low end sheet feeder type, iii) high end professional or book
scanners. Scanning machines are generally based on Charge-Coupled Device (CCD)
technology. In low end devices Contact Image Sensor (CIS) technology is used
generally whereas in some high end devices Photo Multiplier Tube (PMT) technology
is used. PMT based drum scanners produce very high quality images which come at
a high cost. CMOS (Complementary Metal Oxide Semiconductor) is another sensing
technology that is used in hand held digital cameras.
The scanners operate by shining light on the document and directing the reflected
light through a series of mirrors and lenses onto a photosensitive element. The
photosensitive element may be based on CCD, CIS or PMT technology, depending on the
type of scanner. Light-sensitive photosites arrayed along the photosensitive
element convert the light into electronic signals, which are finally processed into a digital image.
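The link between scanning resolution and file size can be illustrated with a little arithmetic. The sketch below is only an illustration: the function name, the approximate A4 page dimensions and the 24-bit colour depth are assumptions chosen for the example, not values prescribed in this Unit.

```python
def uncompressed_scan_bytes(width_in, height_in, dpi, bits_per_pixel):
    """Return the raw (uncompressed) size in bytes of one scanned page."""
    width_px = round(width_in * dpi)      # pixels across the page
    height_px = round(height_in * dpi)    # pixels down the page
    total_bits = width_px * height_px * bits_per_pixel
    return (total_bits + 7) // 8          # round up to whole bytes


# An A4 page is roughly 8.27 x 11.69 inches (assumed for illustration).
for dpi in (150, 300, 600):
    size = uncompressed_scan_bytes(8.27, 11.69, dpi, 24)
    print(f"{dpi} dpi, 24-bit colour: {size / 1_000_000:.1f} MB")
```

Doubling the resolution quadruples the number of pixels, which is why resolution has to be balanced against storage and the compression options discussed later in this Unit.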
Fill in the information for device, format and destination in the dialogue box that
appears.
To scan the documents, click on the Scan All option in the Minolta PS7000
Scanner Setup Dialog Box that appears.
Click on the Done option in the Minolta PS7000 Scanner Setup Dialog Box,
which shows the file like this:
Save the file as a PDF, giving it the .pdf extension. To change the resolution, click on
Scan Setting >> Resolution (DPI) in the Minolta PS7000 Scanner Setup Dialog
Box. To change the scan area, click on Scan Setting >> Scan Area. To change the
image type, click on Scan Setting >> Image Type. You can also change the
Brightness and Contrast of the scanned file by using the drag button in
the right panel. Scanned pages can be saved as individual
files or as a complete document by appending them to the current document while
scanning.
6.2.2 Digitising
The process of digitisation involves capturing the physical or analogue object through
devices such as scanners, digital cameras and recorders, and converting it into numerical
values in bits and bytes, which enables it to be read electronically.
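To make "converting into numerical values" concrete, the following minimal sketch quantises a few analogue readings into 8-bit codes. The input range of -1.0 to 1.0, the function name and the sample values are assumptions made for illustration. Real capture devices do this in hardware.

```python
def quantise(samples, bits):
    """Map analogue values in the range -1.0..1.0 to unsigned integer codes."""
    levels = 2 ** bits
    codes = []
    for value in samples:
        clipped = max(-1.0, min(1.0, value))   # keep the value inside the range
        # Scale -1.0..1.0 onto 0..levels-1 and round to the nearest level.
        codes.append(round((clipped + 1.0) / 2.0 * (levels - 1)))
    return codes


analogue = [-1.0, -0.5, 0.5, 1.0]              # hypothetical sensor readings
digital = quantise(analogue, 8)
print(digital)
print([format(code, "08b") for code in digital])   # the same codes as bits
```

More bits per sample give finer steps between levels, at the cost of larger files.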
Digitisation of text is possible either through text transcription or using optical character
recognition method. Text transcription can be through keying in the text using a
keyboard or by voice recognition software. Keyed-in text is saved in ASCII format,
which does not replicate the structure and format of the original text.
OCR software converts image of text captured by a scanner into computer editable
text which a word processor can read. The software tries to match the image of each
letter against the pattern it recognizes making use of the stored knowledge about the
shapes of individual characters. The OCR software has options for either storing the
text and graphics in their original layout or converting them into ASCII or word
processing format. OmniPage Pro and ABBYY FineReader are two commonly
used OCR software packages.
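The idea of matching a letter image against stored shapes can be sketched with a toy example. This is not how OmniPage or FineReader work internally. The 3x5 bitmaps, the four-letter alphabet and the function names are all assumptions invented for illustration.

```python
# Stored "knowledge" about character shapes: 3x5 bitmaps, 1 = ink, 0 = paper.
TEMPLATES = {
    "I": ("111", "010", "010", "010", "111"),
    "L": ("100", "100", "100", "100", "111"),
    "T": ("111", "010", "010", "010", "010"),
    "O": ("111", "101", "101", "101", "111"),
}


def mismatches(glyph, template):
    """Count the pixels in which a scanned glyph differs from a template."""
    return sum(
        g != t
        for glyph_row, template_row in zip(glyph, template)
        for g, t in zip(glyph_row, template_row)
    )


def recognise(glyph):
    """Return the character whose template differs from the glyph the least."""
    return min(TEMPLATES, key=lambda ch: mismatches(glyph, TEMPLATES[ch]))


noisy_t = ("111", "010", "011", "010", "010")   # a "T" with one stray pixel
print(recognise(noisy_t))
```

Because the closest template wins, a small amount of scanning noise is tolerated. Heavily degraded originals are what trigger the proofreading step described later in this section.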
After OCR, you can export the resulting text to a variety of word-processing, page
layout, and spreadsheet applications. It also provides the option to save it directly as
a PDF file.
Start button with 1-2-3 selected in the Workflow drop-down list. Your pages will be
acquired, auto-zoned and recognized one after the other. Proofing will start if you
requested it. When proofing and/or recognition are finished, an export dialog box
appears. Select the destination, file type and file name to save the file.
To manually perform the OCR, follow the steps given below.
1) Scan the document as an image
• Launch OmniPage Pro: Start > Programs > ScanSoft OmniPage Pro
• The Program will open with the toolbar shown below.
You can skip this step if you want OmniPage to automatically perform the
OCR and select the regions.
• Then choose the file to convert by OCR by clicking on the 1-2-3 option (Start
Button).
• It will give this type of screen to browse the file from any location.
It also gives suggestions from its built-in spell checker. If OmniPage Pro does
not recognise some words in the document, the OCR Proofreader window will
appear. Choose the appropriate response to each unrecognised word.
5) Save as a file
• To do this click on the icon above the Save to File menu.
• Choose the location to save it at and give it a file name and select the file
type to save it as. Now you can save the file in the available format you
want. The typical formats available are MS-Word document *.doc, PDF
*.pdf, HTML *.html, Text *.txt
• Enter your desired file name in the File Name text field.
• You can choose a document format from the Files of type pull-down menu.
The default selection of RTF Word (*.rtf) is highly recommended, as it can
be opened by most of the word processing programs.
• Click OK to save the file.
6.3 VIDEO DIGITISATION
Analogue media such as vinyl, VHS cassettes, and TVs have now been replaced
by superior digital media, such as CDs, DVDs, and HDTVs. Digital media
provide higher quality content. They also allow exact reproduction from copy to
copy, barring any encryption technology implemented to stop copying.
Digital video refers to video being viewed or manipulated in the digital system (for
instance on a computer), or sometimes simply video stored in a digital tape format.
The video may have originally been analogue source material digitised into a
computer, or it may have been stored directly to a digital tape format. Traditionally,
digital tape formats were only available at the professional level (D-1, Digital
Betacam, etc.), but now that some digital tape formats (DV) have emerged on the
consumer scene, there is even more confusion about the generic term “digital
video.”
As computers become faster and disk storage space becomes larger, users are able
to more deftly manipulate the digital data taken from analogue media and frequently
“improve” the original analogue content using various techniques in the digital world.
System Requirements for a beginner multimedia processing system:
• x86-based PC @ 800+ MHz
• 256+ MB RAM
• 40+ GB of free HD space (7200 rpm drive)
• Microsoft Windows 98/ME/2000/XP
• Sound card with Line-in
• Video Capture card
These are the minimum requirements to perform reliable video capture. It is entirely
possible to do video capture with less than this configuration, but good results
cannot be guaranteed. Obviously, a faster CPU, more RAM, and more HD space
are nothing but a good thing. Windows 9x/ME users should be aware that the
FAT32 file system has a limitation preventing files from being larger than 4GB.
Windows machine is strongly recommended since the NTFS file system has no
such file size limitation.
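A rough calculation shows why the 4 GB limit matters for video capture. The frame size, pixel format and frame rate below are assumptions chosen for illustration, and audio is ignored. The sketch estimates how many seconds of uncompressed video fit in one FAT32 file.

```python
FAT32_MAX_FILE_BYTES = 2**32 - 1   # largest single file on FAT32 (just under 4 GiB)


def seconds_until_limit(width, height, bytes_per_pixel, fps,
                        limit=FAT32_MAX_FILE_BYTES):
    """Return how many seconds of uncompressed video fit within `limit` bytes."""
    bytes_per_second = width * height * bytes_per_pixel * fps
    return limit / bytes_per_second


# Assumed capture settings: 640x480 frames, 2 bytes per pixel, 25 frames per second.
secs = seconds_until_limit(640, 480, 2, 25)
print(f"{secs / 60:.1f} minutes of uncompressed video fit in one FAT32 file")
```

With these assumed settings a single file fills in under five minutes, so longer captures need either compression during capture or a file system without this limit.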
2) VirtualDub
VirtualDub is an open source video capture/processing utility for 32-bit
Windows platforms, licensed under the GNU General Public License (GPL).
It lacks the editing power of a general-purpose editor such as Adobe Premiere,
but is streamlined for fast linear operations over video.
3) FFmpeg
It is a complete Open Source, cross-platform solution to record, convert and
stream audio and video. It includes libavcodec - the leading audio/video
codec library.
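As an illustration of a typical conversion, the sketch below only builds an FFmpeg command line. It does not run it. The file names are hypothetical, and the choice of the libx264 and aac encoders is an assumption that depends on how the local FFmpeg was built.

```python
import shlex


def build_ffmpeg_command(source, target, video_codec="libx264", audio_codec="aac"):
    """Build the argument list for converting `source` into `target`."""
    return [
        "ffmpeg",
        "-i", source,           # input file
        "-c:v", video_codec,    # video encoder
        "-c:a", audio_codec,    # audio encoder
        target,                 # output file; the container follows the extension
    ]


# Hypothetical file names, used only for illustration.
cmd = build_ffmpeg_command("lecture_capture.avi", "lecture.mp4")
print(shlex.join(cmd))
```

If FFmpeg is installed, the same list could be passed to subprocess.run(cmd, check=True) to perform the conversion.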
When high-quality streaming at very low bandwidth is the priority,
Flash Media Live Encoder 3 can help you broadcast live events and around-
the-clock programming such as:
• Sporting events
• Concerts
• Webcasts
• News
• Educational events
Self Check Exercise
Text formats   Plain text       Text Files (*.txt): ASCII text files viewed with an editor
                                (such as Edit or Notepad) or with a word processor (such as
                                MS Word). They do not contain any kind of formatting
                                (such as bold, italics, font colour, images, etc.).
               Formatted text   1. doc or odf: Document files created, viewed and edited using
                                programs such as MS Word or OpenOffice Writer.
                                Formatting features such as bold, italics,
                                justification, bullets and numbering, etc., are
                                possible in such formats.
Table 6.2: Common Formats
Format                 File Extension         Notes
XML                    .xml                   An XML file, validated with DTD or schema
                                              specified, is a format suitable for preservation.
SGML                   .sgml, .sgm            An SGML file, validated, with DTD specified,
                                              is suitable for preservation.
HTML                   .htm, .html            Hypertext markup language file, which may
                                              in principle be validated against a DTD. In
                                              practice invalid documents are often produced
                                              and used.
XHTML                  .xhtml, .htm, .html    XML-conformant HTML file, is required to
                                              be well-formed and valid.
DTD                    .dtd                   Document Type Definition. Defines the rules
                                              and syntax applied to a document. To be
                                              supplied with an SGML or XML document.
XML Schema             .xsd                   An XML schema file. Defines the rules and
                                              syntax applied to a document. To be supplied
                                              with an XML document.
Pseudo-SGML            .sgm, .sgml,           A text file employing some SGML-like
                       .txt or other          formalisms for inserting markup, but not valid
                                              SGML. Suitability depends on whether
                                              tagging is consistently applied and well-
                                              documented, sufficient for later migration.
Various non-SGML       .txt or other          Suitability depends on acceptance as de facto
encodings in                                  standard in an academic community, plus an
text files                                    assessment of its likely future viability and
                                              level of documentation.
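The difference between a well-formed and a malformed XML file can be checked with a few lines of code. The sketch below uses only the Python standard library parser and tests well-formedness alone. Validation against a DTD or schema, as required for the preservation formats in Table 6.2, needs a validating parser and is not shown. The sample records are invented for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the text parses as well-formed XML, otherwise False."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


good = "<record><title>Digitisation Process</title></record>"
bad = "<record><title>Digitisation Process</record>"   # <title> is never closed
print(is_well_formed(good), is_well_formed(bad))
```

Running such a check before deposit catches the kind of invalid markup that, as noted for HTML, is often produced and used in practice.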
6.8 SUMMARY
The conversion of analogue sources into digital form and their appropriate storage
and processing form an important part of building a digital library. Digitisation is
a complex process requiring managerial and technical skills. Proper planning and
management help in keeping the cost down, and they also lead to the successful
completion of a digitisation project. Digitisation can be carried out in-house or
outsourced.
Various technical issues need to be considered in a digitisation project ranging
from hardware to software and standards for file formats, file compression and
post-processing. Selection of metadata format depends on the nature of the
documents as well as the nature and needs of the users.
6.10 KEYWORDS
Charge-coupled device (CCD) : A device for the movement of
electrical charge, usually from within
the device to an area where the charge can
be manipulated, for example conversion into
a digital value.
Contact Image Sensors (CIS) : A relatively recent technological innovation in
the field of optical flatbed scanners that is
rapidly replacing CCDs in low-power and
portable applications.
Photomultiplier Tubes (PMT) : Members of the class of vacuum tubes, more
specifically vacuum phototubes, that are
extremely sensitive detectors of light in the
ultraviolet, visible, and near-infrared ranges
of the electromagnetic spectrum.
UNIT 7 CREATING DIGITAL LIBRARIES USING DSPACE
Structure
7.0 Objectives
7.1 Introduction
7.2 Functional Features of DSpace
7.3 Installing DSpace on Windows
7.4 Working with DSpace
7.5 Summary
7.6 Answers to Self Check Exercises
7.7 Keywords
7.8 References and Further Reading
7.0 OBJECTIVES
After going through this Unit, you will be able to:
• Describe the functional features of DSpace;
• Install the Windows version of DSpace; and
• Create digital library using DSpace.
7.1 INTRODUCTION
DSpace is open source software, a turnkey repository application used by more
than 1000 organisations and institutions worldwide to provide durable access to
digital resources. In India more than 140 institutions are using DSpace for building
digital repositories.
DSpace is a software platform that enables organisations to:
• capture and describe digital material using a submission workflow module, or
a variety of programmatic ingest options.
• distribute an organisation’s digital assets over the web through a search and
retrieval system.
• preserve digital assets over the long term.
The DSpace project was initiated in July 2000 as part of the HP-MIT alliance.
The project was given $1.8 million USD by HP over two years to build a digital
archive for MIT that would handle the 10,000 articles produced by MIT authors
annually. DSpace has gone through several versions and the current stable release
available is version 4.2.
The code for DSpace is kept within a source code control system
(http://dspace.svn.sourceforge.net/viewvc/dspace/) that allows code to be added or
modified over time, whilst maintaining a track of all changes and a note of why the
change was made and who made it. Control of the source code repository
is delegated to a small group of ‘committers’ who have the ability to change the
code and release new versions. The committers work with the wider community
of DSpace users to fix bugs and improve the software with new features.
In this Unit we will guide you through the process of installing DSpace (on a
Windows platform) and familiarise you with the process of using DSpace and building
collections in it.
The Unit has been adapted from the DSpace official documentation and the
Courseware developed by Aberystwyth University. Both the documents are available
under the terms of either the GNU General Public License
(http://www.gnu.org/licenses/gpl.html) or the Creative Commons Attribution License
(http://creativecommons.org/licenses/by/4.0/), for distribution and modification. The
documents used are listed in the References and Further Reading section, and
you may refer to them for further details.
Full-text search : DSpace can process uploaded text based contents for full-text
searching. Users may search for specific keywords that only appear in the actual
content and not in the provided description.
Navigation : Users in DSpace find their way to relevant content through:
• Searching for one or more keywords in metadata or extracted full-text
• Faceted browsing through any field provided in the item description.
• Through external reference, such as a Handle
• Browse is another important mechanism for discovery in DSpace, whereby
the user views a particular index, such as the title index, and navigates around
it in search of interesting items.
Supported file types : While DSpace is most known for hosting text based
materials including scholarly communication and electronic theses and dissertations
(ETDs), it can accommodate any type of uploaded file. Files uploaded on DSpace
are referred to as “Bitstreams” as after ingestion, files in DSpace are stored on the
file system as a stream of bits without the file extension.
Optimized for Google Indexing : To support Google Scholar indexing, DSpace
adds specific metadata to the page head tags, which facilitates indexing in Scholar.
Popular DSpace repositories often receive over 60% of their visits from Google
pages.
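The kind of head-tag metadata meant here can be sketched as follows. The tag names citation_title, citation_author, citation_publication_date and citation_pdf_url come from Google Scholar's published inclusion guidelines. The function, the sample item and the URL are hypothetical, and this is not DSpace's own code.

```python
from html import escape


def scholar_meta_tags(title, authors, date, pdf_url):
    """Return HTML <meta> tags describing one item for Google Scholar."""
    fields = [("citation_title", title)]
    fields += [("citation_author", author) for author in authors]   # one tag per author
    fields += [("citation_publication_date", date),
               ("citation_pdf_url", pdf_url)]
    return "\n".join(
        f'<meta name="{name}" content="{escape(value, quote=True)}">'
        for name, value in fields
    )


# A hypothetical repository item, used only for illustration.
print(scholar_meta_tags(
    "Digitisation & Digital Libraries",
    ["A. Author", "B. Author"],
    "2014",
    "http://repository.example.org/bitstream/123456789/42/1/paper.pdf",
))
```

Each author gets a separate tag, and special characters in the values are escaped so that the page head remains valid HTML.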
OpenURL Support
DSpace supports the OpenURL protocol through linking server software called SFX
server. DSpace will display an OpenURL link on every item page, automatically
using the Dublin Core metadata if SFX server is implemented.
Metadata Management
DSpace holds three types of metadata about archived content:
• Descriptive Metadata: A qualified Dublin Core metadata schema loosely
based on the Library Application Profile set of elements and qualifiers is
provided by default. However, one can configure multiple schemas and
select metadata fields from a mix of configured schemas to describe items.
• Administrative Metadata: This includes preservation metadata, provenance
and authorization policy data.
• Structural Metadata: This includes information about how to present an
item, or bitstreams within an item, to an end-user, and the relationships
between constituent parts of the item.
Choice Management and Authority Control
This is a configurable framework that lets you define plug-in classes to control the
choice of values for given DSpace metadata fields. It also lets you configure
fields to include “authority” values along with the textual metadata value. The
choice-control system includes a user interface in both the Configurable Submission
UI and the Admin UI (edit Item pages) that assists the user in choosing metadata
values.
Licensing
DSpace offers support for licenses on different levels:
• Collection and Community Licenses
• License granted by the submitter to the repository
• Creative Commons Support for DSpace Items
Persistent URLs and Identifiers
Researchers require a stable point of reference for their works. To help solve this
problem, a core DSpace feature is the creation of a persistent identifier for every
item, collection and community stored in DSpace. To persist identifiers, DSpace
requires a storage- and location-independent mechanism for creating and maintaining
identifiers. DSpace uses the CNRI Handle System for creating these identifiers.
Similar to handles for DSpace items, bitstreams also have ‘Persistent’ identifiers.
They are more volatile than Handles, since if the content is moved to a different
server or organisation, they will no longer work (hence the quotes around ‘persistent’).
However, they are more easily persisted than the simple URLs based on database
primary key previously used. This means that external systems can more reliably
refer to specific bitstreams stored in a DSpace instance.
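A Handle consists of a prefix identifying the naming authority and a locally assigned suffix, and it can be resolved through the public proxy at hdl.handle.net. The sketch below shows how such an identifier is composed and split. The prefix 123456789 is only a placeholder, and the helper names are hypothetical.

```python
def handle_url(prefix, suffix, resolver="https://hdl.handle.net"):
    """Return the resolvable URL for the handle prefix/suffix."""
    return f"{resolver}/{prefix}/{suffix}"


def split_handle(handle):
    """Split 'prefix/suffix' into its two parts, rejecting malformed input."""
    prefix, _, suffix = handle.partition("/")
    if not prefix or not suffix:
        raise ValueError(f"not a valid handle: {handle!r}")
    return prefix, suffix


# 123456789 is a placeholder prefix; a real repository registers its own.
print(handle_url("123456789", "42"))
print(split_handle("123456789/42"))
```

Because the URL points at the resolver and not at the repository server, the same identifier keeps working if the repository moves to a different host.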
The batch item importer is an application, which turns an external SIP (an XML
metadata document with some content files) into an “in progress submission”
object. The Web submission UI is similarly used by an end-user to assemble an
“in progress submission” object.
When the Batch Ingester or Web Submit UI completes the In Progress Submission
object, and invokes the next stage of ingest (be that workflow or item installation),
a provenance message is added to the Dublin Core which includes the filenames
and checksums of the content of the submission. Likewise, each time a workflow
changes state (e.g. a reviewer accepts the submission), a similar provenance
statement is added. This allows us to track how the item has changed since a user
submitted it.
Once any workflow process is successfully and positively completed, the In Progress
Submission object is consumed by an “item installer”, that converts the In Progress
Submission into a fully blown archived item in DSpace. The item installer:
• Assigns an accession date
• Adds a “date.available” value to the Dublin Core metadata record of the item
• Adds an issue date if none already present
• Adds a provenance message (including bitstream checksums)
• Assigns a Handle persistent identifier
• Adds the item to the target collection, and adds appropriate authorization
policies
• Adds the new item to the search and browse index.
Workflow Steps
A collection’s workflow can have up to three steps. Each collection may have an
associated e-person group for performing each step; if no group is associated with
a certain step, that step is skipped. If a collection has no e-person groups associated
with any step, submissions to that collection are installed straight into the main
archive.
In other words, the sequence is this: The collection receives a submission. If the
collection has a group assigned for workflow step 1, that step is invoked, and the
group is notified. Otherwise, workflow step 1 is skipped. Likewise, workflow
steps 2 and 3 are performed if and only if the collection has a group assigned to
those steps.
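The sequence just described can be modelled in a few lines. This is a simplified sketch, not DSpace code: the function names and group names are hypothetical, and it assumes that every invoked step ends with the submission being accepted.

```python
def steps_to_run(step_groups):
    """Return the workflow steps (1..3) that have an e-person group assigned."""
    return [step for step in (1, 2, 3) if step_groups.get(step)]


def route_submission(step_groups):
    """List what happens to a submission, assuming each step accepts it."""
    steps = steps_to_run(step_groups)
    if not steps:
        # No group on any step: the submission skips the workflow altogether.
        return ["installed straight into the archive"]
    events = [f"step {step}: task pooled for group '{step_groups[step]}'"
              for step in steps]
    events.append("installed into the archive")
    return events


# Hypothetical collection with groups on steps 1 and 3 only.
for event in route_submission({1: "Reviewers", 2: None, 3: "Metadata Editors"}):
    print(event)
```

Here step 2 is skipped because no group is assigned to it, which mirrors the rule stated above.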
When a step is invoked, the submission is put into the ‘task pool’ of the step’s
associated group. One member of that group takes the task from the pool, and
it is then removed from the task pool, to avoid the situation where several people
in the group may be performing the same task without realizing it.
The member of the group who has taken the task from the pool may then perform
one of three actions:
Workflow Step Possible actions
1 Can accept submission for inclusion, or reject submission.
2 Can edit metadata provided by the user with the submission,
but cannot change the submitted files. Can accept submission
for inclusion, or reject submission.
3 Can edit metadata provided by the user with the submission,
but cannot change the submitted files. Must then commit to
archive; may not reject submission.
The reason for this apparently arbitrary design is that it was the simplest case that
covered the needs of the early adopter communities at MIT. The functionality of
the workflow system will no doubt be extended in the future.
- SWORD Support
SWORD (Simple Web-service Offering Repository Deposit) is a protocol
that allows the remote deposit of items into repositories.
- Packager Plugins
Packagers are software modules that translate between DSpace Item objects
and a self-contained external representation, or “package”. A Package
Ingester interprets, or ingests, the package and creates an Item. A Package
Disseminator writes out the contents of an Item in the package format.
Crosswalk Plugins
Crosswalks are software modules that translate between DSpace object metadata
and a specific external representation. An Ingestion Crosswalk interprets the external
format and crosswalks it to DSpace’s internal data structure, while a Dissemination
Crosswalk does the opposite.
The Packager plugins and OAI-PMH server make use of crosswalk plugins.
- No policies
User Management
E-People and Groups are the way DSpace identifies application users for the
purpose of granting privileges. Both E-People and Groups are granted privileges
by the authorization system described below.
- E-mail address.
- Whether the user is able to log in to the system via the Web UI, and whether
they must use an X509 certificate to do so.
- A list of collections for which the e-person wishes to be notified of new items.
- Whether the e-person ‘self-registered’ with the system; that is, whether the
system created the e-person record automatically as a result of the end-user
independently registering with the system, as opposed to the e-person record
being generated from the institution’s personnel database, for example.
– Subscriptions
As noted above, end-users (e-people) may ‘subscribe’ to collections in order
to be alerted when new items appear in those collections.
– Groups
Groups are another kind of entity that can be granted permissions in the
authorization system. A group is usually an explicit list of E-People; anyone
identified as one of those E-People also gains the privileges granted to the
group.
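How privileges flow from groups to their members can be sketched as a simple set union. The data structures, user addresses and privilege names below are hypothetical and greatly simplified. DSpace's real authorization system attaches policies to individual objects.

```python
def effective_privileges(eperson, direct, groups, group_privileges):
    """Combine an e-person's own privileges with those of their groups."""
    privileges = set(direct.get(eperson, ()))
    for group, members in groups.items():
        if eperson in members:
            # Membership in a group grants everything granted to that group.
            privileges |= set(group_privileges.get(group, ()))
    return privileges


# Hypothetical users, groups and privilege names.
direct = {"asha@example.org": {"READ"}}
groups = {"Submitters": {"asha@example.org", "ravi@example.org"}}
group_privileges = {"Submitters": {"ADD"}}

print(sorted(effective_privileges("asha@example.org", direct, groups, group_privileges)))
print(sorted(effective_privileges("ravi@example.org", direct, groups, group_privileges)))
```

Granting a privilege once to a group saves repeating it for every member.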
Access Control
Authentication
Authorization
Usage Metrics
DSpace is equipped with SOLR based infrastructure to log and display page
views and file downloads.
Usage statistics can be retrieved from individual item, collection and community
pages.
- System Statistics
Various statistical reports about the contents and use of your system can be
automatically generated by the system. These are generated by analyzing
DSpace’s log files.
Digital Preservation
- Checksum Checker
The purpose of the checker is to verify that the content in a DSpace repository
has not become corrupted or been tampered with.
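The principle behind such a check is to recompute a checksum and compare it with the value recorded at ingest. The sketch below uses MD5 purely for illustration. The function names and sample content are hypothetical, and this is not DSpace's own checker.

```python
import hashlib


def checksum(data, algorithm="md5"):
    """Return the hexadecimal checksum of a bytes object."""
    return hashlib.new(algorithm, data).hexdigest()


def is_intact(data, recorded_checksum, algorithm="md5"):
    """Return True if the data still matches the checksum recorded at ingest."""
    return checksum(data, algorithm) == recorded_checksum


original = b"bitstream content"
recorded = checksum(original)                      # stored when the file is ingested
print(is_intact(original, recorded))               # content unchanged
print(is_intact(b"bitstream c0ntent", recorded))   # one character altered
```

Even a one-character change produces a different checksum, so corruption or tampering shows up as a mismatch.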
System Design
Each DSpace site is divided into communities, which can be further divided
into sub-communities reflecting the typical university structure of college,
department, research center, or laboratory.
Communities contain collections, which are groupings of related content. A
collection may appear in more than one community.
Each collection is composed of items, which are the basic archival elements of the
archive. Each item is owned by one collection. Additionally, an item may appear
in additional collections; however every item has one and only one owning collection.
Items are further subdivided into named bundles of bitstreams. Bitstreams are, as
the name suggests, streams of bits, usually ordinary computer files. Bitstreams that
are somehow closely related, for example HTML files and images that compose
a single HTML document, are organized into bundles.
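The hierarchy described above can be pictured as a set of nested records. The following sketch is only a model for study purposes: the class and field names are assumptions and do not mirror DSpace's actual Java classes or database tables.

```python
from dataclasses import dataclass, field


@dataclass
class Bitstream:
    name: str
    size_bytes: int


@dataclass
class Bundle:
    name: str
    bitstreams: list = field(default_factory=list)


@dataclass
class Item:
    title: str
    bundles: list = field(default_factory=list)


@dataclass
class Collection:
    name: str
    owned_items: list = field(default_factory=list)    # items this collection owns
    mapped_items: list = field(default_factory=list)   # items that merely appear here


@dataclass
class Community:
    name: str
    sub_communities: list = field(default_factory=list)
    collections: list = field(default_factory=list)


def owning_collections(item, collections):
    """Return the names of the collections that own the given item."""
    return [c.name for c in collections
            if any(owned is item for owned in c.owned_items)]


# Illustrative names only.
thesis_pdf = Bitstream("thesis.pdf", 1_048_576)
item = Item("Sample thesis", bundles=[Bundle("ORIGINAL", [thesis_pdf])])
theses = Collection("Theses", owned_items=[item])
highlights = Collection("Research Highlights", mapped_items=[item])
department = Community("Department of LIS", collections=[theses, highlights])
university = Community("University", sub_communities=[department])

print(owning_collections(item, [theses, highlights]))
```

The item appears in two collections but is owned by only one, which reflects the rule that every item has one and only one owning collection.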
SRB (Storage Resource Broker) is purely optional; it may be used in lieu of the
server’s file system or in addition to it. Without going into a full description, SRB is a
robust, sophisticated storage manager that offers essentially unlimited storage and a
straightforward means to replicate (in simple terms, back up) the content on other
local or remote storage resources.
Pre-requisite Software
You’ll need to install this pre-requisite software (for DSpace 1.5.x and higher).
Check the “Windows Installation” section of the System Documentation for the
most recent pre-requisites, as they sometimes differ based on the version of
DSpace you are running.
- Apache Ant 1.7.x is a Java-based build tool. Just unzip it wherever you
want it installed, and add [path-to-apache-ant]\bin to your system PATH.
Self Check Exercise
Note: i) Write your answers in the space given below.
ii) Check your answers with the answers given at the end of this Unit.
2) What prerequisite software is required for DSpace?
...................................................................................................................
...................................................................................................................
...................................................................................................................
...................................................................................................................
...................................................................................................................
2) Install JDK : Double click and execute the Java installer file that you
have downloaded. Complete the JDK installation by clicking Finish. Another installer
will start automatically to install the JRE. Click Next (or you may cancel
it) > Click Finish to close the installer. The next step is to set up the environment
variables and JAVA_HOME.
3) Install Apache Maven and Apache Ant : Extract the Apache Maven and
Apache Ant archives to the C drive. Then add Apache Maven to the
system path: Right click My Computer > Properties > Advanced >
Environment Variables > select Path and click Edit > add the path
“C:\apache-maven-2.2.1\bin”. Define the path for Apache Ant in the same way:
open the extracted Apache Ant folder on the C drive, copy the folder path from
the Windows Explorer address bar and paste it into the system path. Click OK. This
completes the task of defining all system paths
[C:\Program Files\Java\jdk1.6.0_14\bin; C:\apache-maven-2.2.1\bin;
C:\apache-ant-1.8.0\bin]. Now define ANT_HOME in the user variables. Variable name:
ANT_HOME; Variable value: C:\apache-ant-1.8.0 > Click OK and apply
the settings. All system paths and user variables are now defined. We can also
check what we have done so far. Open a command prompt and run the
following command to see the Java version: C:\> java -version. In the same way
you can run ‘ant -version’ and ‘mvn -version’, and the command prompt
will show the relevant information for the respective software. If the output looks
right, we may conclude that Java, Maven and Ant are
successfully installed and the paths are appropriately defined.
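The same check can be scripted. The following is a minimal Python sketch, not part of the DSpace distribution; the function name find_tools is hypothetical. It looks up each command on the system PATH and reports which ones cannot be found.

```python
import shutil

def find_tools(names):
    """Map each command name to its full path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}

if __name__ == "__main__":
    # The three commands checked manually in the step above.
    for tool, path in find_tools(["java", "mvn", "ant"]).items():
        print(f"{tool}: {path or 'NOT FOUND on PATH'}")
```

A missing entry usually means the corresponding bin folder was not added to the Path variable, or that the command prompt was opened before the variables were saved.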
4) Install Apache Tomcat : Double click on the Apache Tomcat installer file.
> Now tick all the components in order to do a full installation and
then click Next. > In this window give your username and password, which
will give you access to monitor and control your Tomcat server web interface.
Then click Next. > Make sure that your Java virtual machine path
matches your JRE installation folder. Click Install. > Click Finish;
this will start the Tomcat service automatically. You will see the Apache icon in the
notification area of the taskbar.
6) Install DSpace : Ensure the PostgreSQL service is running, and then run
pgAdmin III (Start -> PostgreSQL 8.x -> pgAdmin III). Create the directory
for the DSpace installation (e.g. C:\DSpace).
Build DSpace in the normal fashion. From [dspace-source]\dspace run:
mvn package
ant fresh_install
C:\dspace\bin\dsrun org.dspace.administer.CreateAdministrator
• Provide Descriptive
Metadata for the collection
4) Creating a user and groups
Users require accounts to be able to log in and submit or edit items. Logical
collections of users can be placed in groups to make administration easier.
DSpace allows users to create their own accounts, for which the following
steps are to be followed:
• Click on My DSpace link
• Click on ‘New user? Click here to register.’
• Enter an email address and press ‘Register’
• Follow the link in the email that is sent for verification
• Provide name, telephone number, and a password
• New users have no privileges.
Users may be combined into logical groups for managing users and assigning
privileges. Two special groups exist in DSpace: i) the Anonymous group,
which has no explicit members; anyone can view the content
without being logged in, and ii) the Administrator group, which contains users who have full
administrator access.
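How group membership translates into privileges can be sketched in a few lines of Python. This is only an illustration of the idea and not DSpace's authorization code: the class name, the action names READ and ADD, and the e-mail addresses are all hypothetical, and treating every visitor as belonging to Anonymous is an assumption drawn from the description above.

```python
ANONYMOUS = "Anonymous"
ADMINISTRATOR = "Administrator"

class Authorizer:
    def __init__(self):
        self.members = {}     # group name -> set of e-person e-mail addresses
        self.grants = set()   # (group name, action) pairs

    def add_member(self, group, email):
        self.members.setdefault(group, set()).add(email)

    def grant(self, group, action):
        self.grants.add((group, action))

    def groups_of(self, email):
        # Every visitor, logged in or not, counts as Anonymous (assumption).
        groups = {ANONYMOUS}
        groups.update(g for g, people in self.members.items() if email in people)
        return groups

    def is_allowed(self, email, action):
        groups = self.groups_of(email)
        if ADMINISTRATOR in groups:
            return True       # administrators have full access
        return any((g, action) in self.grants for g in groups)

auth = Authorizer()
auth.grant(ANONYMOUS, "READ")
auth.add_member("Submitters", "user@example.org")
auth.grant("Submitters", "ADD")
auth.add_member(ADMINISTRATOR, "admin@example.org")

print(auth.is_allowed(None, "READ"))                  # visitor who is not logged in
print(auth.is_allowed("user@example.org", "ADD"))     # privilege gained via a group
```

Granting a privilege to a group rather than to individuals means that adding or removing one person never requires touching the permissions themselves.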
Title Format
Creator Identifier
Subject Source
Description Language
Publisher Relation
Contributor Coverage
Date Rights
Type
The elements can be further refined through the use of qualifiers as shown
below in the case of the base DC element Title:
Schema = ‘dc’
Elements viz. Title / Creator / Subject / Description
Qualifiers e.g. Title.main / Title.subtitle / Title.series.
Multiple schemas can be held in the metadata registry of DSpace, which is
accessed through Administer menu -> Metadata Registry.
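The schema/element/qualifier pattern can be made concrete with a short Python sketch. The helper name parse_field and the dotted field names used here are illustrative assumptions that follow the Title example above; they are not taken from a particular registry.

```python
from typing import NamedTuple, Optional

class MetadataField(NamedTuple):
    schema: str
    element: str
    qualifier: Optional[str]

def parse_field(name: str) -> MetadataField:
    """Split 'schema.element' or 'schema.element.qualifier' into its parts."""
    parts = name.split(".")
    if len(parts) == 2:
        return MetadataField(parts[0], parts[1], None)
    if len(parts) == 3:
        return MetadataField(parts[0], parts[1], parts[2])
    raise ValueError(f"expected schema.element[.qualifier], got {name!r}")

print(parse_field("dc.title"))            # unqualified element
print(parse_field("dc.title.subtitle"))   # element refined by a qualifier
```

An unqualified field simply carries no qualifier, which is why the third part is optional.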
A schema can be edited and submitted using the ‘Update’ button, an element can be
deleted using the ‘Delete’ button next to it, and new elements can be added
using the ‘Add Metadata Field’ section at the bottom of the page.
There are three options available for decision on the workflow:
• Accept/Reject Step – allows a user to simply accept an item, or reject
it (with proper justification).
• Accept/Reject/Edit Metadata Step – allows a user to either accept or
reject an item, and edit its metadata.
• Edit Metadata Step – allows the user to edit the metadata. This might be
done to correct the metadata, or to improve it.
Any or all of the steps may be used. Workflow steps are worked through in
order. If steps 1 and 3 are selected, step 1 must be completed before step
3 is initiated.
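The ordering rule can be expressed as a small Python sketch. The function name next_step is hypothetical and the code only models the rule stated above, not DSpace's workflow engine: among the steps enabled for a collection, the lowest-numbered one that has not yet been completed is the one awaiting action.

```python
def next_step(enabled, completed):
    """Return the lowest-numbered enabled step not yet completed, or None."""
    for step in sorted(enabled):
        if step not in completed:
            return step
    return None   # every enabled step is done

enabled = {1, 3}                      # steps 1 and 3 selected for the collection
print(next_step(enabled, set()))      # nothing done yet
print(next_step(enabled, {1}))        # step 1 finished
print(next_step(enabled, {1, 3}))     # workflow complete
```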
For an existing collection you may create the workflow through the following
steps:
Log in as an administrator; go to the collection for which you wish to create a
workflow. Click on the button ‘Edit’ in the ‘Admin Tools’ box.
Find the ‘Submission Workflow’ section, and click on whichever step you
wish to create.
Edit the list of user and groups who can participate in the workflow as shown
below:
When you have finished, press ‘Update Group’.
Use the same process to edit and delete a workflow in a collection.
Once an item has entered a workflow, the concerned users and group
members will receive an email alert that there is a task awaiting attention.
When a user visits their ‘My DSpace’ page they will see any tasks in the
pool.
On clicking ‘Take Task’ the user gets an overview of the item and can decide
whether they wish to take the task.
Clicking ‘Accept This Task’ will take the user into the workflow task page where
they have several options for action, such as Approve, Reject, Edit Metadata, Do
Later and Return Task to Pool.
7.5 SUMMARY
DSpace is a platform that allows you to capture items in any format (text, video,
audio, and data) and distribute them over the web. It indexes the entire collection so that
users can search and retrieve your items. It is well suited to the preservation of digital
work over the long term.
In this Unit we have discussed in detail the technical features of DSpace, along with
the process of installing it on your system and using it to develop a digital
library.
7.7 KEYWORDS
Bitstream : a stream of data in binary form.
Checksum Checker : A checksum is a count of the number of bits in a
transmission unit that is included with the unit so that
the receiver can check to see whether the same
number of bits arrived.
OpenURL : A standardised format of Uniform Resource
Locator(URL) intended to enable Internet users to
more easily find a copy of a resource that they are
allowed to access.
UNIT 8 CREATING DIGITAL LIBRARIES USING GSDL
Structure
8.0 Objectives
8.1 Introduction
8.2 Technical Features
8.3 Installation of GSDL on Windows
8.4 Greenstone Interfaces
8.5 Collection Building In Greenstone
8.6 Summary
8.7 Answers to Self Check Exercises
8.8 Keywords
8.9 References and Further Reading
8.0 OBJECTIVES
After going through this Unit, you will be able to:
• explain the technical features of Greenstone Digital Library (GSDL) Software;
• install GSDL on your system; and
• build a digital collection for the web as well as CD-ROM for your library.
8.1 INTRODUCTION
Greenstone is an open-source, multilingual software, issued under the terms of the
GNU General Public License for building and distributing digital library collections.
The aim of the Greenstone software is to empower users, particularly in universities,
libraries, and other public service institutions, to build their own digital libraries. It
provides a new way of organizing information and publishing it on the Internet or
on CD-ROM in the form of a fully-searchable, metadata-driven digital library.
Greenstone has been produced by the New Zealand Digital Library Project at
the University of Waikato, and is now being further developed and distributed in
cooperation with UNESCO and the Human Info NGO in Belgium.
The exact user base for Greenstone is unknown. However, it has been
distributed on SourceForge since November 2000, and downloads since then
have averaged around 4500 per month.
The advantages of GSDL are:
• It is based on a FOSS platform and has an active community supporting it.
• It is a multi-platform application and can run on various operating system
platforms, including Windows (any version), Linux, Sun Solaris, and Mac
OS X. It is available in both binary (executable) and source code form for the
Windows (all versions), Linux, and Mac OS X operating systems and in
source code form for other operating systems (Unix).
• A Greenstone collection can be served on the World Wide Web or it can
be exported to a CD-ROM and accessed from the CD-ROM or local hard
disc without the need for Internet connectivity.
• Greenstone can build indexes from full text documents and also metadata
associated with these documents. It supports creation of indexes for various
metadata fields, either automatically extracted or manually assigned.
• It uses proven technologies: Perl scripting, MG(PP) or Lucene for indexing,
Apache (or the built-in web server), and XML.
• Greenstone lets you build collections of multimedia documents such as audio,
video, and pictures accompanied by textual description or metadata to allow
searching and browsing.
• It is Unicode compliant, which facilitates building, searching and browsing documents
in any Unicode-compliant language.
• Separate modules are available for different uses:
– Java-based interface for management
– Web-browser based access to collections
– CLI client: remote collection building
• Multi-metadata (with editor)
• Practical GLI interface for editing/managing GSDL
• Plug-ins are available for most document formats, as well as crosswalks
for ISIS, DSpace, e-mails, MARC, MARCXML.
The Unit has been adapted from the Greenstone official documentation and the
IMARK tutorial developed by FAO. Both the documents are available under the
terms of either the GNU General Public License (http://www.gnu.org/licenses/gpl.html)
or the Creative Commons Attribution License
(http://creativecommons.org/licenses/by/4.0/), for distribution and modification. The
documents used are listed in the References and Further Reading section, and
you may refer to them for further details.
Interoperability
It is highly interoperable, based on contemporary standards. Greenstone can harvest
documents over OAI-PMH and include them in a collection. Greenstone can
ingest documents in METS (Metadata Encoding and Transmission Standard) form.
This facilitates export and import of any collection to and from DSpace through
DSpace batch import program.
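To give a feel for what harvesting over OAI-PMH involves, the Python sketch below builds the request URL a harvester would send. The verb, metadataPrefix and set parameters come from the OAI-PMH protocol itself; the function name and the repository address are hypothetical, and this is not Greenstone's own harvesting code.

```python
from urllib.parse import urlencode

def list_records_url(base_url: str, metadata_prefix: str = "oai_dc",
                     set_spec: str = "") -> str:
    """Build an OAI-PMH ListRecords request URL for the given repository."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec      # optionally restrict the harvest to one set
    return f"{base_url}?{urlencode(params)}"

# Hypothetical repository endpoint, used only for illustration.
print(list_records_url("http://repository.example.org/oai/request"))
```

The repository answers such a request with an XML document containing the metadata records, which the harvester then adds to its own collection.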
Interfaces
Greenstone has two separate interactive interfaces, the Reader interface and the
Librarian interface. End users access the digital library through the Reader interface,
which operates within a web browser. The Librarian interface is a Java-based
graphical user interface (also available as an applet) that makes it easy to gather
material for a collection (downloading it from the web where necessary), enrich
it by adding metadata, design the searching and browsing facilities that the collection
will offer the user, and build and serve the collection.
Metadata formats
Users define metadata interactively within the Librarian interface. Unlike DSpace,
Greenstone allows several sets of metadata, including locally produced ones, to be
merged. The following metadata sets are predefined:
• Dublin Core (qualified and unqualified)
• RFC 1807
• NZGLS (New Zealand Government Locator Service)
• AGLS (Australian Government Locator Service)
All metadata are stored in XML format with the documents. Metadata can also
be extracted from XML statements within the documents. It can be assigned easily
through the GSDL Librarian interface using Greenstone’s Metadata Set Editor.
“Plug-ins” are used to ingest externally prepared metadata in different forms, and
plug-ins exist for: XML, MARC, CDS/ISIS, ProCite, BibTeX, Refer, OAI, DSpace
and METS.
Document formats
Plug-ins are also used to ingest documents. For textual documents, there are plug-
ins for: PDF, PostScript, Word, RTF, HTML, Plain text, Latex, ZIP archives,
Excel, PPT, Email (various formats), source code. For multimedia documents,
there are plug-ins for: Images (any format, including GIF, JIF, JPEG, TIFF), MP3
audio, Ogg Vorbis audio, and a generic plug-in that can be configured for audio
formats, MPEG, MIDI, etc.
Languages
One of Greenstone’s unique strengths is its multilingual nature. The reader’s interface
is available in the following languages: Arabic, Armenian, Bengali, Catalan, Croatian,
Czech, Chinese (both simplified and traditional), Dutch, English, Farsi, Finnish,
French, Galician, Georgian, German, Greek, Hebrew, Hindi, Indonesian, Italian,
Japanese, Kannada, Kazakh, Kyrgyz, Latvian, Maori, Mongolian, Portuguese
(BR and PT versions), Russian, Serbian, Spanish, Thai, Turkish, Ukrainian,
Vietnamese.
The Librarian interface and the full Greenstone documentation (which is extensive)
are available in: English, French, Spanish, and Russian.
You will need Java to run Greenstone. You might already have it installed on your
system; otherwise, download it from http://java.sun.com. To work with image
collections, you need ImageMagick (from http://www.imagemagick.org).
Most Greenstone CD-ROMs have AutoPlay feature and start the installation
process as soon as they are inserted into the drive. If installation does not begin
by itself, locate the file setup.exe and double click it to start the installation process.
If you downloaded Greenstone from the web, just double-click the installer.
If Greenstone is already installed on your system then completely remove
the old version before installing a new one. You need not remove any
pre-packaged collections that you may have installed.
The following steps need to be carried out to install Greenstone:
1) Install the Java 2 Runtime Environment (latest version).
2) After installing J2RE, go to the GSDL folder and choose the setup for GSDL 2.70.
3) Choose setup Language. English (US) is the default. We choose English
4) Welcome to the InstallShield Wizard for the Greenstone Digital Library
Software. Click <Next>
5) License Agreement. Accept the agreement and then click <Next>
6) Choose location to install Greenstone. Leave at the default and click <Next>
7) Setup Type. Leave at the default (Local Library) and click <Next>
8) (For older installers you must now select collections. Leave at the default,
Documented Example Collections, and click <Next>)
9) Set admin password. Choose a suitable password and click <Next> (If your
computer will not be serving collections online, the password doesn’t matter)
10) Click <Install> to complete the installation
11) Files are copied across and Installation is complete.
If you are installing from a CD-ROM, the installer will offer to install ImageMagick,
and Java, if necessary.
The remaining steps are straightforward, and, as before, it is recommended that you
use the default settings. Here is what you need to do to install ImageMagick:
1) “This will install ImageMagick 5.5.7 Q8. Do you wish to continue?” Yes
2) “Welcome to the ImageMagick Setup Wizard” Click <Next>
3) “Information: Please read the following ...” Click <Next>
4) “Select Destination Directory ...” Leave at default and click <Next>
5) “Select Start Menu Folder ...” Leave at default and click <Next>
6) “Select Additional Tasks ...” Leave at default and click <Next>
7) “Ready to Install”. Click <Install>
8) Files are copied across
9) “You have now installed ...” Click <Next>
10) “Setup has finished ...”. Deselect “View index.html” and click <Finish>.
1) Gathering documents: selecting files from the ‘local file space’ or local
network, or downloading them using protocols viz. WWW, OAI (Open Archives
Initiative), Z39.50, SRW (Search and Retrieve Web service), MediaWiki.
4) ‘plug-ins’ (filters), indexing the documents and providing a preview facility for
direct access to the web page with the search interface produced by GLI are done at
this stage. Once the build is successful, the collection needs to be linked for
previewing.
Self Check Exercise
Note: i) Write your answers in the space given below.
ii) Check your answers with the answers given at the end of this Unit.
2) What functions are available in the Librarian’s Interface?
...................................................................................................................
...................................................................................................................
...................................................................................................................
...................................................................................................................
...................................................................................................................
...................................................................................................................
...................................................................................................................
...................................................................................................................
Collection Searching
Greenstone supports different ways of searching collections. They can be grouped
in two main categories: “plain search” (through Google-like single search box) and
“form-based search”.
• Plain search:
Simple - Users can search for words or phrases in the full text of the
document or limit the search to a specific index (e.g. document title or author)
by selecting the available index from the drop-down box.
• Form-based search
Simple - Users can search for words or phrases across different fields.
Advanced - Users can search for words or phrases across different fields,
with support for Boolean query combination, case folding and stemming.
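Two of the options mentioned above, Boolean combination and case folding, can be illustrated with a toy Python sketch. It is not how Greenstone's indexers (MG, MGPP or Lucene) work internally; the function names and sample documents are made up, and stemming is left out to keep the example short.

```python
def tokenize(text, fold_case=True):
    """Split text into a set of words, lower-casing them when case folding is on."""
    return {w.lower() if fold_case else w for w in text.split()}

def search(docs, terms, operator="AND", fold_case=True):
    """Return ids of documents matching the terms combined with AND or OR."""
    wanted = tokenize(" ".join(terms), fold_case)
    combine = all if operator == "AND" else any
    hits = []
    for doc_id, text in docs.items():
        words = tokenize(text, fold_case)
        if combine(term in words for term in wanted):
            hits.append(doc_id)
    return sorted(hits)

docs = {
    "d1": "Digital Library Software",
    "d2": "library automation",
    "d3": "Digital preservation",
}
print(search(docs, ["digital", "library"]))                  # both words required
print(search(docs, ["digital", "library"], operator="OR"))   # either word suffices
print(search(docs, ["digital"], fold_case=False))            # case-sensitive match
```

With case folding switched off, the query word "digital" no longer matches "Digital", which is why the last search finds nothing.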
Document Browsing
Greenstone supports browsing of documents in a collection by specific metadata
fields.
Available browse elements for a collection are shown on the navigation bar in the
collection home page. Hierarchical browsing of classification-like structures (e.g.
a subject classification) with different levels is possible.
Greenstone supports a multilingual interface. Through the preferences setting, the
user can change the language of the Greenstone interface. It can also support
indexing and searching of document collections in non-Latin scripts.
3) Switch to the Create panel, and build and preview the collection.
5) Back in the Librarian Interface, click the Enrich tab to view the automatically
extracted metadata. You will need to scroll down to see the extracted metadata,
which begins with “ex.”. The PostScript documents (cluster.ps and
langmodl.ps) do not have extracted titles: what appears in the titles a-z list
is just the first few characters of the document.
7) Now add dc.Creator information for the same document. You can add more
than one value for the same field: when you press Enter in a metadata value
field, a new empty field of the same type will be generated.
8) Close the document when you have finished copying metadata from it. External
programs opened when viewing documents must be closed before building
the collection, otherwise errors can occur.
9) Next add title and creator metadata for a few of the other documents.
If you build and preview your collection at this point, you will find that
nothing has changed. You need to alter the collection design to use the
new Dublin Core metadata instead of the original extracted metadata.
10) Collection design; branding a collection with an image
Change to the Design panel, which is split into several sections. The first
section General appears. This allows you to modify the values you provided
when defining the collection, if desired. You can also brand the collection
using a suitable image.
11) Click on the <Browse...> button associated with URL to about page icon,
and browse to the image sample_files → Word_and_PDF → wrdpdf.gif on
your computer. When you select this image, Greenstone automatically generates
an appropriate URL for the image. Preview the collection.
If you are on the web, you can easily make your own Greenstone-style icon
by going to http://www.greenstone.org/make-images.html and following the
instructions there.
Document plugins
12) Now look at the Document Plugins section, by clicking on this in the list to
the left. Here you can add, configure or remove plugins to be used in the
collection. There is no need to remove any plugins, but doing so will speed up
processing a little. In this case we have only Word, PDF, RTF, and PostScript
documents, and can remove the ZIPPlug, TEXTPlug, HTMLPlug, EMAILPlug,
ImagePlug, ISISPlug and NULPlug plugins. To delete a plugin, select it and
click <Remove Plugin>. GAPlug is required for any type of source collection
and should not be removed.
14) To include “form search” as well as the default “plain search”, pull down
the Search Types menu and select form; then click <Add Search Type>.
Plain search will be the default search type as it is first in the list.
Search indexes
15) The next step in the Design panel is Search Indexes. These specify what
parts of the collection are searchable (e.g. searching by title and author).
Delete the ex.Title and ex.Source indexes, which are not particularly useful,
by selecting them one at a time and clicking <Remove Index>. Only
the text index remains.
16) Now add a Title index based on dc.Title by providing an Index Name (e.g.
“Document Title”) and selecting dc.Title from the Index Source box. Then
click <Add Index>.
17) You can add indexes based on any metadata. Add an index called “Authors”
based on dc.Creator metadata.
19) Now we add an AZList classifier for dc.Title metadata. Select AZList from
the Select classifier to add drop-down list and click <Add Classifier>
21) Now add an AZCompactList classifier. Click <Add Classifier> and configure
it to use dc.Creator metadata, with button name “Creator”. Click <OK>.
The last three sections are Format Features, Translate Text and Metadata
Sets. In this exercise, we will not make any changes to these.
22) Switch to the Create panel, and build and preview the collection.
23) Check that all the facilities work properly. There should be three full-text
indexes, called text, Document Title, and Authors. The titles a-z list should
show all the documents to which you have assigned dc.Title metadata (and
only those documents). The authors a-z list should show one bookshelf
for each author you have assigned as dc.Creator, and clicking on that bookshelf
should take you to all the documents they authored.
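The bookshelf behaviour of the authors list can be pictured with a short Python sketch. It only illustrates the grouping idea and is not the AZCompactList classifier's implementation; the function name, titles and author names are placeholders.

```python
from collections import defaultdict

def build_bookshelves(documents):
    """Group document titles under each dc.Creator value (one shelf per author)."""
    shelves = defaultdict(list)
    for doc in documents:
        # A document may carry several dc.Creator values, or none at all.
        for creator in doc.get("dc.Creator", []):
            shelves[creator].append(doc["dc.Title"])
    return {author: sorted(titles) for author, titles in sorted(shelves.items())}

documents = [
    {"dc.Title": "Clustering", "dc.Creator": ["Author A", "Author B"]},
    {"dc.Title": "Language models", "dc.Creator": ["Author A"]},
    {"dc.Title": "Notes"},   # no dc.Creator assigned, so it sits on no shelf
]
for author, titles in build_bookshelves(documents).items():
    print(author, "->", titles)
```

A document with two creators appears on both shelves, while a document with no dc.Creator value appears on none, which mirrors what step 23 asks you to check.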
In a similar fashion you can build collections for other types of file formats.
For details visit the tutorial site of Greenstone.
8.6 SUMMARY
Greenstone is freely available open source software for building and distributing
digital library collections through the Internet or on CD-ROM. Multiplatform availability, the capability
of providing access in different ways and managing different file formats, media
and languages are some of the major advantages of Greenstone. The Librarian
Interface provides an advanced and at the same time very user-friendly
approach to collection building and metadata management.
8.8 KEYWORDS
Lucene : Open source search engine.
Perl : A script programming language that is similar in syntax to
the C language and that includes a number of popular
UNIX facilities.
UNICODE : An international encoding standard for use with different
languages and scripts, by which each letter, digit, or symbol
is assigned a unique numeric value that applies across
different platforms and programs.
XML : Extensible Markup Language (XML) is a markup language
that defines a set of rules for encoding documents in a
format which is both human-readable and machine-
readable.