User Interfaces and Information Seeking in Digital Libraries: A Tutorial
Gary Marchionini
University of North Carolina at Chapel Hill
[email protected]
ICADL 2001, Bangalore, India
December 10, 2001
What is a DL?
Characteristics
electronic digital formats
networked (sharable information)
organization apparent (a library not a pile)
Collection development policy
Systematic data structuring and tagging
use (fair) policy
persistent
guidance and referral
community based
Motivations: technology, funding, democracy
Gary Marchionini-UNC-Chapel Hill
Digital Library Design Space
Community
Technology
Services
Content
DL Missions
DLs clearly must aim to collect, manage, and preserve electronic expressions of knowledge (this is a well-established mission).
Knowledge is in people's heads: DLs should aim to facilitate the use and development of the collective knowledge in human consciousness.
Human attention is a fundamental natural resource: DLs should provide tools and resources (material and expertise) to help optimize this resource.
Sharium
A virtual workspace with rich content and powerful tools where people can work independently or collaborate with each other to learn and solve information problems. A collaborative problem solving environment.
Organized around resources and tools
Encourages contributions and participation
Is sustainable
Sharium Workspace
[Figure: The Sharium Work Space. Panel labels: Messaging; Search/Discovery; Problem Solving/Construction; Digital Library; Contribution; Channels; Files; Tools; Presentation]
Query & Selection
Interfaces
Natural language queries
Dynamic queries
Alternative interfaces
Help/support
Consortia/portals/channels
Interoperation
Selection and merging
Reference & Question Answering
Help people help themselves
Elicitation
Layered services
Quality control
Economic model
Privacy
Shared views/clients
Cascading Assistance
Information Need
Self Help
Automated Help (FAQ, query clarification, AnswerGardens)
Community Help
Intermediary Help
Expert Help
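One way to read the cascade is as an escalation chain: an information need is offered to each level in turn and moves on only when that level has no answer. The sketch below is a minimal illustration under that assumption. The level names come from the slide; the per-level answer tables and the `assist` function are hypothetical stand-ins, not part of any deployed service.

```python
from typing import Dict, List, Tuple

# Hypothetical answer tables, one per assistance level, cheapest level first.
# A real service would put search tools, forums, or staff queues behind each level.
CASCADE: List[Tuple[str, Dict[str, str]]] = [
    ("Self Help", {}),
    ("Automated Help", {"opening hours": "The collection is available online at all times."}),
    ("Community Help", {"citation style": "Other users recommend the style guide page."}),
    ("Intermediary Help", {"interlibrary loan": "A librarian can place the request."}),
    ("Expert Help", {}),
]


def assist(need: str, cascade: List[Tuple[str, Dict[str, str]]] = CASCADE) -> Tuple[str, str]:
    """Return (level, answer) from the first level that can handle the need."""
    key = need.strip().lower()
    for level, answers in cascade:
        if key in answers:
            return level, answers[key]
    # Nothing cheaper worked, so the need is handed to the last (most costly) level.
    return cascade[-1][0], f"Referred to a subject expert: {need}"


level, answer = assist("Opening hours")
print(level, "|", answer)
```

Ordering the levels from cheapest to most expensive keeps scarce human expertise for the needs that the earlier levels cannot resolve.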
Interactive Model of Retrieval
Iterative Retrieval Model
Stages: Query, Result set, Document, Stop
Agileview Interaction Model
Stages: Need, OView (overview), Query, RSet (result set), PView (preview), Doc (document), Stop
Tight coupling of functions
Highly interactive control mechanisms
Flexible, non-linear options
Result Set manipulations added
Document processing tools added
Technical View: Retrieval as Matching Documents to Queries
[Figure: Match Algorithm between the Document Space and the Query Space. Labels: Surrogates; Sample; Terms; Vectors; Query Form A; Query Form B; Etc.]
Retrieval is algorithmic. Evaluation is typically a binary decision for each pairwise match and one or more aggregate values for a set of matches (e.g., recall and precision).
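To make the technical view concrete, the sketch below matches a query against document surrogates and then scores the result set. It is an illustration only: cosine similarity over raw term counts is one common matching function, chosen here as an assumption, and the function names, threshold, and sample surrogates are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, Set, Tuple


def term_vector(text: str) -> Counter:
    """Represent a surrogate or query as a bag of lower-cased terms."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query: str, surrogates: Dict[str, str], threshold: float = 0.3) -> Set[str]:
    """Binary decision per query-document pair: keep documents scoring at or above the threshold."""
    q = term_vector(query)
    return {doc_id for doc_id, text in surrogates.items()
            if cosine(q, term_vector(text)) >= threshold}


def precision_recall(retrieved: Set[str], relevant: Set[str]) -> Tuple[float, float]:
    """Aggregate measures over the whole set of matches."""
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical surrogates and relevance judgments, for illustration only.
surrogates = {
    "d1": "digital library interfaces",
    "d2": "video browsing surrogates",
    "d3": "digital video library",
}
retrieved = retrieve("digital library", surrogates)
print(sorted(retrieved), precision_recall(retrieved, relevant={"d1"}))
```

In this toy run two documents pass the threshold but only one is judged relevant, so recall is perfect while precision is one half.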
Human View: Information-Seeking Process
[Figure: the information-seeking process. Labels: Problem; Perceived Needs; Actions; Queries; Results; Physical Interface; Indexes; Data]
Information seeking is an active, iterative process controlled by a human who changes throughout the process. Evaluation is relative to human needs.
MultiView Interaction
Integrate query and browsing
Closely couple query and results
Highly interactive control mechanisms (direct manipulation)
Overviews and Previews
Alternative interfaces (views)
Design Strategies
Consider the information seeker's context
Cognitive accessibility (it does not matter how good the results are if the information cannot be easily understood)
Cost-benefit assessment (it does not matter how good the results are if there is no time to use them)
Study special populations (cell biologist vs. practicing physician)
Usability testing approach (iterative, impressionistic)
Systematic case studies
Epidemiology approach (start with outcomes and trace influences)
Develop an IS interaction model
Agile Views Framework
Theory to Practice
Design challenge 1: creating views
What granularities? (collections and items)
Which attribute sets?
Creating or extracting metadata
Design challenge 2: manipulating/controlling views
Perceptual estimation (e.g., look ahead)
Physical and conceptual inertia
Examples of Agile View Design Techniques
Relation Browser
Federal statistics, overviews of relationships (several different partitions). Useful for small number of attribute sets, each with small number of attribute values. Backend database of metadata, Java applet interface
Enriched Links
Complex web sites: previews, overviews, and reviews of pages. Backend computation and JavaScript interface
Integrated overviews and previews
Multimedia digital library, backend computation, Java applet interface
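The relation-browser idea of overviews across several partitions can be approximated with simple counts over a metadata table: one count per attribute value, and one count per pair of values from two attributes. The sketch below assumes a small, hypothetical set of records and attribute names; it is not the applet's actual backend.

```python
from collections import Counter
from typing import Dict, List, Tuple

Record = Dict[str, str]

# Hypothetical metadata records with a few attributes, each taking a few values.
RECORDS: List[Record] = [
    {"topic": "labor", "region": "national"},
    {"topic": "labor", "region": "state"},
    {"topic": "health", "region": "national"},
    {"topic": "labor", "region": "national"},
]


def partition(records: List[Record], attribute: str) -> Counter:
    """Overview of one partition: how many items fall under each attribute value."""
    return Counter(r[attribute] for r in records)


def relate(records: List[Record], attr_a: str, attr_b: str) -> Counter:
    """Relationship between two partitions: item counts per pair of values."""
    pairs: List[Tuple[str, str]] = [(r[attr_a], r[attr_b]) for r in records]
    return Counter(pairs)


print(partition(RECORDS, "topic"))
print(relate(RECORDS, "topic", "region"))
```

Because every pair of values gets its own cell, the display stays readable only when both the number of attributes and the number of values per attribute are small, which matches the constraint stated above.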
Relation Browser
Enriched Links: Preview
Enriched Links: Overview
Enriched Links: Shared View
Overviews and Previews: One Screen
The Open Video Project Case
A New Testbed for Agile Views
Open Video Project (www.open-video.org)
Research community
Contributory facility
Granularities: collection of videos and collection of segments
Attributes: three levels of metadata
Overviews and Previews
Need to gain understanding of neighborhood of objects (the aboutness problem)
Need to quickly understand whether an object is interesting (the relevance problem)
Digital Libraries exacerbate the problems
one view fits all (screen, levels of granularity, media)
Why Video Browsing?
Digital Libraries and Video-on-demand applications -> lots of digital video
As part of retrieval
embedded in larger task
quick decisions about rejection
To speed basic understanding (accretion)
To save time, bandwidth, money
Key Problems
What to represent (the representation problem)
Case Dependence: The usefulness of a representation depends upon how well-suited it is to the purpose for which it is used. (David Marr, 1980, Vision)
Implies a need for MULTIPLE LEVELS of representation
How to control the representations (the user control mechanism problem)
Video Hierarchy
Video
Clip (segment) [conceptual/editorial; physical]
Sequence (scene) [conceptual/editorial]
Shot [camera specific]
Frame [physical]
Video Surrogates
Linguistic information
bibliographic records
descriptors, extracts (e.g., transcripts, closed captions)
reviews
Audio information (speech, music, effects)
Clips
rushes/outtakes/trailers/teasers
original vs. extractions
Fast Forwards (compress time)
Key Frames (aka poster frame, thumbnail, storyboard)
Key Frames
Video segmentation (chunking)
scene changes
other large changes (camera, sound)
signal processing: color histogram; motion; luminosity; texture; voice; music
A frame from the change is significant (thesis sentence?)
Salient stills
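A rough sketch of histogram-based segmentation follows: consecutive frames are compared by intensity histogram, a large jump is treated as a change, and the first frame after each change is kept as a key frame. Everything here is an assumption made for illustration: frames are flat lists of grayscale values from 0 to 255, and the bin count, distance measure, and threshold are arbitrary choices rather than the settings of any system named in this tutorial.

```python
from typing import List

Frame = List[int]  # assumed: non-empty flat list of grayscale pixel values, 0-255


def histogram(frame: Frame, bins: int = 4, max_value: int = 256) -> List[float]:
    """Normalized intensity histogram of one frame."""
    counts = [0] * bins
    for pixel in frame:
        counts[pixel * bins // max_value] += 1
    return [c / len(frame) for c in counts]


def histogram_distance(h1: List[float], h2: List[float]) -> float:
    """Half the L1 distance: 0.0 for identical histograms, 1.0 for disjoint ones."""
    return sum(abs(a - b) for a, b in zip(h1, h2)) / 2


def key_frame_indices(frames: List[Frame], threshold: float = 0.5) -> List[int]:
    """Indices of the first frame of each detected segment."""
    if not frames:
        return []
    keys = [0]  # the opening frame starts the first segment
    previous = histogram(frames[0])
    for i in range(1, len(frames)):
        current = histogram(frames[i])
        if histogram_distance(previous, current) > threshold:
            keys.append(i)  # large change: frame i opens a new segment
        previous = current
    return keys


# Toy clip: two dark frames, two bright frames, then one dark frame.
dark, bright = [10] * 8, [240] * 8
print(key_frame_indices([dark, dark, bright, bright, dark]))
```

A comparison on one signal will miss changes that leave the histogram intact, which is why the list above names several signals (motion, texture, sound) rather than color alone.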
Control Mechanisms
Dynamic (surrogates move, user may have some control over movement)
Slide shows
display rate
image size
Fast forward
not effective for key frames
Multiple concurrent surrogates
2 feasible; 3 or 4 problematic
multi-video view (Bellcore)
within single video
Control Mechanisms
Static
Story Boards (aka filmstrip, v-wall, v-array)
layout
array size
with or without labels (e.g., words, time codes)
Hierarchical
key frame structure (Singapore)
Network
Scene Transition Graph (IBM)
Extract
Streamer (MIT)
Salient Still (MIT)
Control Mechanisms
Hybrid
image tree (Bellcore)
TOC+full video skim (Informedia)
story board+miniclips+sound (e.g., show frames 1-5, 50-55, 106-120, etc., OR base on audio, e.g., best key words from transcript)
Compression and Compaction
Compression (e.g., MPEG)
save space
save bandwidth (transmission time)
Compaction: the HUMAN perspective
2 minute clip, show 10 key frames for 500 ms each
30 fps * 120 sec = 3600 frames; 3600 frames:10 key frames --> 360:1 compression?
120 seconds:5 seconds --> 24:1 compaction
We prefer the human-meaningful ratio
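The two ratios in this example can be checked with a few lines of arithmetic. The function names below are hypothetical; the numbers are the ones given above (a 2-minute clip at 30 fps, shown as 10 key frames for 500 ms each).

```python
def compression_ratio(fps: float, duration_s: float, key_frames: int) -> float:
    """Frames in the source clip per key frame shown (a data-oriented ratio)."""
    return fps * duration_s / key_frames


def compaction_ratio(duration_s: float, key_frames: int, display_ms: float) -> float:
    """Source viewing time per second spent viewing the surrogate (the human ratio)."""
    viewing_s = key_frames * display_ms / 1000
    return duration_s / viewing_s


print(compression_ratio(fps=30, duration_s=120, key_frames=10))        # 360.0
print(compaction_ratio(duration_s=120, key_frames=10, display_ms=500)) # 24.0
```

The frame-count ratio changes with the frame rate of the source, while the compaction ratio depends only on how long a person spends viewing the surrogate.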
Overview: Open Video
One of many previews: Open Video
Give People Flexibility!
Multiple views require rich and accessible metadata
Control mechanisms are kludges in today's WWW environment
A click is a terrible thing to waste!
Evaluation
Information Flow
Information Life Cycle changes?
Creation -> pub/review/dissemination -> use -> regeneration/dispensation
Accelerate cycle rates?
Add new feedback channels? (e.g., at BLS, I hypothesize that good user interfaces attract more diverse users, which in turn not only affects the publication phase but also propagates back to the creation phase, i.e., affects the survey(s))