From SubjectA to THGenius: searching the Semantic Web
29th ADLUG Annual Meeting 2010
Centro Congressi Panorama, Trento (Provincia Autonoma di Trento), 22-24 September 2010
RDF and Open Linked Data, a first approach (part II)

Library data in a modern context
The library catalogue (as a traditional catalogue or as an OPAC) has been the only context for library data since its inception.
The purposes of the library catalogue:
• Identifying the library’s holdings
• Supporting the management of those holdings
• Providing entry and discovery points for librarians and non-librarian users
Are the efforts that librarians put into creating and maintaining the catalogue rewarded by users? For a variety of reasons, users favour the Web over the library as an information platform.
The question for librarians and vendors has to be: how can we strengthen the relationship between libraries and users?
The question that we must face, and face sooner rather than later, is how we can best transform our data so that it becomes part of the dominant information environment, the Web.

The Web as context
Current scenario: a change is under way.
• The Web is increasingly the source of information for searchers and researchers, and the library needs to be interconnected with that web of data.
• Library catalogue data must be transformed from the current ‘textual description’ into a set of data elements to which machine processes can be applied.
• These data elements must be compatible with the current technology, the World Wide Web.
This process is what we can call the evolution from ‘library catalogue’ to Semantic Web.
For us as a vendor, this process means the evolution from SubjectA to THGenius.

Data in the traditional catalogue
‘Asia Minor Occidentalis’ as a MARC21 authority record:

    =LDR 00688nz a2200265n 4500
    =001 000000008238
    =005 20100519190730.0
    =008 100519nn\ano\\ba\n\\\\\\\\\\\n\ana\\\\\d
    =040 \\$aOSZK$bhun$fKöztaurusz
    =151 \\$aAsia Minor Occidentalis
    =551 \\$wgnnn$aókori történeti táj
    =551 \\$whnnn$aBithynia
    =551 \\$whnnn$aCaria
    =551 \\$whnnn$aIonia
    =551 \\$whnnn$aLycaonia
    =551 \\$whnnn$aLycia
    =551 \\$whnnn$aLydia
    =551 \\$whnnn$aMysia
    =551 \\$whnnn$aPamphylia
    =551 \\$whnnn$aPhrygia
    =551 \\$whnnn$aPisidia
    =551 \\$wjnnn$aAsia Minor
    =551 \\$wpnnn$aTörökország
    =551 \\$wmnnn$aAsia Minor Orientalis
    =751 \4$a(392)

The knowledge base for the Web
‘Asia Minor Occidentalis’ as a web resource (in RDF/SKOS format):

    <skos:Concept rdf:about="http://nektar.oszk.hu/resource/auth/Asia_Minor_Occidentalis">
      <skos:inScheme rdf:resource="http://www.oszk.hu/thesaurus/location"/>
      <dc:source>OSZK geotezaurusz</dc:source>
      <dc:type>location</dc:type>
      <skos:prefLabel xml:lang="hu">Asia Minor Occidentalis</skos:prefLabel>
      <skos:broader rdf:resource="http://nektar.oszk.hu/resource/auth/ókori_történeti_táj"/>
      <skos:narrower rdf:resource="http://nektar.oszk.hu/resource/auth/Bithynia"/>
      <skos:narrower rdf:resource="http://nektar.oszk.hu/resource/auth/Caria"/>
      <skos:narrower rdf:resource="http://nektar.oszk.hu/resource/auth/Ionia"/>
      <skos:narrower rdf:resource="http://nektar.oszk.hu/resource/auth/Lycaonia"/>
      <skos:narrower rdf:resource="http://nektar.oszk.hu/resource/auth/Lycia"/>
      <skos:narrower rdf:resource="http://nektar.oszk.hu/resource/auth/Lydia"/>
      <skos:narrower rdf:resource="http://nektar.oszk.hu/resource/auth/Mysia"/>
      <skos:narrower rdf:resource="http://nektar.oszk.hu/resource/auth/Pamphylia"/>
      <skos:narrower rdf:resource="http://nektar.oszk.hu/resource/auth/Phrygia"/>
      <skos:narrower rdf:resource="http://nektar.oszk.hu/resource/auth/Pisidia"/>
      <skos:broader rdf:resource="http://nektar.oszk.hu/resource/auth/Asia_Minor"/>
      <skos:related rdf:resource="http://nektar.oszk.hu/resource/auth/Törökország"/>
    </skos:Concept>

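The step from the MARC21 record to the SKOS concept can be sketched in a few lines of Python. This is only an illustration, not the conversion THGenius performs: the code-to-property table is inferred from comparing the two records shown in the slides (it is not an official MARC21-to-SKOS mapping), and the names `W_CODE_TO_SKOS`, `BASE` and `marc_551_to_skos` are made up for the example.

```python
import re

# Relation codes found in $w of the 551 fields of the example record, mapped to
# the SKOS properties that the RDF version uses for the same terms.
# Assumption: inferred from the two slides only, not an official mapping.
W_CODE_TO_SKOS = {
    "g": "skos:broader",
    "h": "skos:narrower",
    "j": "skos:broader",
    "p": "skos:related",
}

# Base URI taken from the rdf:about / rdf:resource values in the SKOS example.
BASE = "http://nektar.oszk.hu/resource/auth/"

# A 551 line in the mnemonic layout: tag, two indicator characters, $w code, $a label.
FIELD_551 = re.compile(r"^=551\s+\S{2}\$w(?P<code>\w)\w*\$a(?P<label>.+)$")


def marc_551_to_skos(lines):
    """Yield (skos_property, concept_uri) for each 551 line with a known code."""
    for line in lines:
        match = FIELD_551.match(line.strip())
        if not match:
            continue
        prop = W_CODE_TO_SKOS.get(match.group("code"))
        if prop is None:
            continue  # e.g. code "m" has no counterpart in the SKOS example
        label = match.group("label").strip()
        yield prop, BASE + label.replace(" ", "_")


record = [
    r"=151 \\$aAsia Minor Occidentalis",
    r"=551 \\$wgnnn$aókori történeti táj",
    r"=551 \\$whnnn$aBithynia",
    r"=551 \\$wjnnn$aAsia Minor",
    r"=551 \\$wpnnn$aTörökország",
    r"=551 \\$wmnnn$aAsia Minor Orientalis",
]

relations = list(marc_551_to_skos(record))
for prop, uri in relations:
    print(prop, uri)
```

Replacing spaces with underscores reproduces the URIs used in the SKOS example; a production converter would also need a policy for unmapped codes and for characters that are not safe in URIs.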
The Web as context
What we can manage now with THGenius, using RDF (Resource Description Framework):
• RDF/SKOS objects: Simple Knowledge Organization System, for representing thesauri, classification schemes, taxonomies, subject-heading systems and so on
• RDF/FOAF objects: Friend of a Friend, an ontology describing persons, their activities and their relations to other people and objects
• RDF/DC objects: Dublin Core metadata in RDF, used to describe information resources such as documents
The common goal: to publish our data on the Web as linked entities.

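To make the three vocabularies concrete, the sketch below writes a handful of statements as N-Triples lines using only the Python standard library. The namespace URIs are the published SKOS, FOAF and Dublin Core ones and the concept URI comes from the SKOS example; the person and document URIs under `example.org` are hypothetical, and the helper names are made up for the illustration.

```python
# Published namespace URIs of the vocabularies named on the slide.
SKOS = "http://www.w3.org/2004/02/skos/core#"
FOAF = "http://xmlns.com/foaf/0.1/"
DC = "http://purl.org/dc/elements/1.1/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def uri(value):
    """Format a URI as an N-Triples term."""
    return f"<{value}>"


def literal(text, lang=None):
    """Format a plain or language-tagged literal as an N-Triples term."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"' + (f"@{lang}" if lang else "")


def triple(subject, predicate, obj):
    """Join three already formatted terms into one N-Triples statement."""
    return f"{subject} {predicate} {obj} ."


concept = uri("http://nektar.oszk.hu/resource/auth/Asia_Minor_Occidentalis")
person = uri("http://example.org/person/tiziana-possemato")  # hypothetical URI
document = uri("http://example.org/doc/WeLoan.ppt")          # hypothetical URI

lines = [
    # RDF/SKOS: a thesaurus concept with its preferred label
    triple(concept, uri(RDF_TYPE), uri(SKOS + "Concept")),
    triple(concept, uri(SKOS + "prefLabel"), literal("Asia Minor Occidentalis", "hu")),
    # RDF/FOAF: a person
    triple(person, uri(RDF_TYPE), uri(FOAF + "Person")),
    triple(person, uri(FOAF + "name"), literal("Tiziana Possemato")),
    # RDF/DC: a document described with Dublin Core elements
    triple(document, uri(DC + "title"), literal("WeLoan - The new circulation module")),
    triple(document, uri(DC + "creator"), literal("Tiziana Possemato")),
]

print("\n".join(lines))
```

Because every subject is a URI, any of these statements can be linked to from other datasets, which is the point of publishing the data as linked entities.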
Library data in a modern context
From http://semanticweb.org/wiki/SPARQL_endpoint:
“A SPARQL endpoint enables users (human or other) to query a knowledge base via the SPARQL language. Results are typically returned in one or more machine-processable formats. Therefore, a SPARQL endpoint is mostly conceived as a machine-friendly interface towards a knowledge base”
“Both the formulation of the queries and the human-readable presentation of the results should typically be implemented by the calling software, and not be done manually by human users”
Our proposal: THGenius

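As the quotation says, the calling software formulates the query. A minimal sketch of that calling side follows: it prepares, but does not send, a SPARQL Protocol GET request asking for the narrower concepts of ‘Asia Minor Occidentalis’. The endpoint address is a placeholder, since the slides do not give the URL of the THGenius endpoint, and `build_request` is a made-up helper name.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Placeholder address: the real THGenius endpoint URL is not given in the slides.
ENDPOINT = "http://example.org/sparql"

QUERY = """
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
SELECT ?narrower ?label WHERE {
  <http://nektar.oszk.hu/resource/auth/Asia_Minor_Occidentalis> skos:narrower ?narrower .
  OPTIONAL { ?narrower skos:prefLabel ?label }
}
ORDER BY ?narrower
""".strip()


def build_request(endpoint, query):
    """Prepare a SPARQL Protocol GET request that asks for JSON results."""
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": "application/sparql-results+json"})


request = build_request(ENDPOINT, QUERY)
print(request.full_url)
# Sending the request would be: urllib.request.urlopen(request)
```

The JSON result format is one of the machine-processable formats the quotation refers to; turning it into a readable page is left to the client application.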
THGenius: the SPARQL endpoint
‘Asia Minor Occidentalis’ in THGenius, which reads the SKOS concept
THGenius: search the Semantic Web (2 slides)
The open search in THGenius
THGenius: different perspectives to see concepts (2 slides)
THGenius: also a new Thesaurus management system
• WeCat: a traditional way to manage thesauri (4 slides)
• THGenius: authorised people manage the thesaurus via the web (2 slides)

Keyword and Keyphrases Indexing (1/9)
Keywords and keyphrases summarize and describe the content of single documents and provide additional semantic metadata that is useful for many purposes.
The task of assigning keywords and keyphrases to a document is called keyphrase / keyword indexing. In libraries, professional indexers select keyphrases and keywords from a controlled vocabulary (Subject Headings) according to defined cataloguing rules.
The idea behind the process described in the next slides is to automate the indexing task, so that a set of keywords / keyphrases is added to our documents automatically, extracted using the semantic relationships within a thesaurus (expressed in SKOS format).
This is another interesting advantage of having a thesaurus in SKOS format.

Keyword and Keyphrases Indexing (2/9)
For our example we will use the following set of documents:

    Format  File               Description
    DOC     Circulation.doc    Amicus Circulation Module user manual
    PDF     Dubliners.pdf      The Dubliners (by J. Joyce)
    PDF     Harry_Potter.pdf   Harry Potter and the Quest for Values (a thesis)
    XLS     bondvaluation.xls  Bond Calc Spreadsheet
    PDF     Moby-Dick.pdf      Moby Dick (by H. Melville)
    ODT     Searching.odt      Amicus Search Module user manual
    PPT     WeLoan.ppt         Amicus Circulation Web Module (by T. Possemato)

Keyword and Keyphrases Indexing (3/9)
First of all we will extract metadata from our documents. Specifically, we will get the “title” and “author” metadata attributes. The following is what the process produces:

    Metadata attribute : title=Bond Calculator
    Metadata attribute : author=Robert Jones
    Metadata attribute : title=Amicus Circulation Module - User Manual
    Metadata attribute : author=Anneke
    Metadata attribute : title=Dubliners
    Metadata attribute : author=James Joyce
    Metadata attribute : title=Harry Potter and the Quest for Values
    Metadata attribute : author=Tony Lennard
    Metadata attribute : title=Moby Dick
    Metadata attribute : author=Herman Melville
    Metadata attribute : title=WeLoan - The new circulation module
    Metadata attribute : author=Tiziana Possemato
    ...

Keyword and Keyphrases Indexing (4/9)
As a second step, we proceed with text extraction. Regardless of the file format, we extract the textual content from each document.
Together with the previously extracted metadata, this is an important input for keyword indexing because later, using the extracted text, the system will be able to determine the occurrence, frequency and relevance of terms within the documents.
Keep in mind that the file format is not important from this point of view. You can use doc, txt, pdf, rtf, xml, html, OpenOffice documents and, generally speaking, any format that has (direct or indirect) textual content.

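Each format needs its own extractor, and the slides do not say which tool THGenius uses. As an illustration for just one of the listed formats, the sketch below pulls the visible text out of an HTML document with Python's standard `html.parser`; the class name `TextExtractor` and the sample markup are made up for the example, and binary formats such as doc or pdf would need a dedicated library.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect character data, ignoring the content of script and style elements."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self._chunks.append(data)

    def text(self):
        # Join the chunks and collapse every run of whitespace to one space.
        return " ".join(" ".join(self._chunks).split())


def extract_text(html):
    """Return the plain text contained in an HTML string."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


sample = (
    "<html><head><style>p {color: red}</style></head>"
    "<body><h1>Check In</h1>"
    "<p>Enter the barcode of the copy and press enter.</p></body></html>"
)
print(extract_text(sample))
```

Whatever the source format, the result handed to the next step is the same kind of object: a plain string.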
Keyword and Keyphrases Indexing (5/9)
As an example, the following is a section of the text extracted from “Amicus Circulation Module user manual” (a Microsoft Word document):

    ...
    If the item is received in the requesting library, it needs to be checked in to make the item available for circulation. To do this, one has to follow the steps given below:
    Click on the Check In button on the Circulation Main Menu.
    Enter the barcode of the copy and press enter. A message appears that the item has arrived from transit.
    Click on the close button.
    If one checks the status of the copy in the requesting library, one will find an additional field on the status of copy screen: “Original branch”, showing the owning branch of the transferred item.
    After the check in of the item, the copy can be charged out to the borrower who requested the copy. The policies of the requesting library are valid as policies for this book.
    A hold can be placed on an item of another library from the moment the book has been transferred by the owning library.
    See: charge out policies
    Note that: The item can be charged out immediately when a borrower is present in the library at that moment. In this case, it is not necessary to check in the copy first before doing a charge out.
    To return the copy
    There are two options to return a transferred copy to the owning library.
    The first option is when the borrower comes to check in the copy:
    Enter the Barcode number of the copy on the Check In screen and press enter.
    ...

Keyword and Keyphrases Indexing (6/9)
After extracting the textual content from our documents, it is time to extract keywords and keyphrases. To do that we need:
• Metadata attributes: see the first step
• Text: see the second step
• A controlled vocabulary (thesaurus) in SKOS format
Regarding the last point, for this example we will use the Library of Congress Subject Headings (LCSH), but keep in mind that any thesaurus in SKOS format can be used.

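How text and vocabulary come together can be illustrated with a deliberately simple sketch: count how often each preferred label of the thesaurus occurs in the extracted text and keep the most frequent ones. This is plain label matching, not the method used in THGenius, which according to the slides also exploits the semantic relationships within the thesaurus; the function name `extract_keyphrases` and the sample sentence are made up, and the labels are a few of the LCSH terms shown in the results.

```python
import re
from collections import Counter


def extract_keyphrases(text, labels, top=5):
    """Rank thesaurus labels by the number of times they occur in the text."""
    # Normalize both sides to lowercase words separated by single spaces.
    normalized = " ".join(re.findall(r"\w+", text.lower()))
    counts = Counter()
    for label in labels:
        needle = " ".join(re.findall(r"\w+", label.lower()))
        if not needle:
            continue
        hits = len(re.findall(rf"\b{re.escape(needle)}\b", normalized))
        if hits:
            counts[label] = hits
    return counts.most_common(top)


# A few preferred labels, as they could be read from skos:prefLabel elements.
labels = ["Whaling", "Whales", "Ships", "Boats and boating", "Poultry"]

# Made-up sample sentence standing in for extracted document text.
text = (
    "The ships left for whaling grounds. Whales were sighted; "
    "the ships gave chase to the whales."
)

result = extract_keyphrases(text, labels)
print(result)
```

Matching whole normalized words keeps ‘Whales’ from matching inside longer words, but it also misses inflected forms; stemming and the broader/narrower links of the thesaurus are the obvious next refinements.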
Keyword and Keyphrases Indexing (7/9)
The following are keyphrases and keywords extracted from Moby Dick by Herman Melville using two different thesauri (Library of Congress Subject Headings and Medical Subject Headings).
LCSH: Soils; Whaling; Whales; Hand; Boats and boating; History; Ships; Steam engines; Steam engineers; Poultry; Seas; Journalism; History; Emotions; Interest (Psychology); Fat; Steam engineering; Steam-engines; ...
MeSH: Dogs; Male; Smell; Simian Acquired Immunodeficiency Syndrome; Female; Spermatozoa; Animals; Animation [Publication Type]; Sleep; Leg; Cattle; Mouth; Monsters; Aged; Aging; Mortality; DNA Transposable Elements; Brain; ...

Keyword and Keyphrases Indexing (8/9)
Finally, after indexing metadata, text, keywords and keyphrases, we can search those documents using our favourite search engine.
Keyword and Keyphrases Indexing (9/9)
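What “indexing metadata and keyphrases” amounts to can be shown with a toy inverted index. A real deployment would rely on a search engine, as the slide says; this sketch only illustrates the idea, the function names are made up, and the sample records reuse titles, authors and LCSH terms from the earlier slides (no keyphrases are shown there for Dubliners, so that list is left empty).

```python
import re
from collections import defaultdict


def tokenize(value):
    """Split a string into lowercase word tokens."""
    return re.findall(r"\w+", value.lower())


def build_index(documents):
    """Map each token to the set of document ids whose fields contain it."""
    index = defaultdict(set)
    for doc_id, fields in documents.items():
        for value in fields.values():
            items = value if isinstance(value, list) else [value]
            for item in items:
                for token in tokenize(item):
                    index[token].add(doc_id)
    return index


def search(index, query):
    """Return the ids of documents that contain every token of the query."""
    tokens = tokenize(query)
    if not tokens:
        return set()
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return result


documents = {
    "Moby-Dick.pdf": {
        "title": "Moby Dick",
        "author": "Herman Melville",
        "keyphrases": ["Whaling", "Whales", "Ships"],
    },
    "Dubliners.pdf": {
        "title": "Dubliners",
        "author": "James Joyce",
        "keyphrases": [],
    },
}

index = build_index(documents)
print(search(index, "whaling ships"))
```

Because the keyphrases are indexed next to title and author, a query for a subject term finds a document even when the term never appears in its title.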
THGenius in a few concepts
What THGenius is:
• The best opportunity for a library to be attractive to modern, smart users
• The evolution from the traditional library catalogue to the Semantic Web, not only from a vendor’s but also from a library’s point of view
• A very powerful and user-friendly way to produce, use and share library data that is available on the Web
• A simple way to ‘manage’ thesaurus and authority data in a standard and reusable format
• A powerful and simple way to improve the search functions, by adding full text and metadata from different file types

THGenius, rdf and open linked data for thesaurus management
