What is Metadata
Lesson 7: Metadata
CC image by bonus on Flickr
What is Metadata
• Explanation of metadata
• Illustrate the value and utility of metadata to data users,
data providers, and organizations
• Examine information included in a metadata record
• Examples of metadata standards and how to choose
• Preparing to write metadata
• Tips for writing a quality metadata record
CC image by Alec Couros on Flickr
What is Metadata
After completing this lesson, the participant will be able to:
• Identify and list the types of information typically included
in metadata records for environmental datasets
• Identify 3 reasons metadata is of value to data users, data
developers, and organizations
• List 3 uses for metadata, beyond discovery of data
• Identify and describe factors that may determine which
metadata standards are most appropriate for a given
dataset
• List steps to prepare to write metadata
• Explain how to write good metadata
What is Metadata
[Figure: the data life cycle: Plan, Collect, Assure, Describe, Preserve, Discover, Integrate, Analyze]
What is Metadata
Metadata is: Data ‘reporting’
• WHO created the data?
• WHAT is the content of the data?
• WHEN were the data created?
• WHERE is it geographically?
• HOW were the data developed?
• WHY were the data developed?
Photo by Michelle Chang. All Rights Reserved
What is Metadata
• Metadata is all around…
Author(s): Boullosa, Carmen.
Title(s): They're cows, we're pigs / by Carmen Boullosa
Place: New York : Grove Press, 1997.
Physical Descr: viii, 180 p ; 22 cm.
Subject(s): Pirates Caribbean Area Fiction.
Format: Fiction
CC image by USDAgov on Flickr
CC image by Mskadu on Flickr
What is Metadata
· Data Discovery
• Metadata: captures information
• USGS Science Data Catalog: enabling discovery
• DataONE: enables exchange
What is Metadata
· Scientific Understanding and Reuse
[Figure: information content plotted against time. Labels: time of data development, specific details, general details, accident, retirement or career change, death. Modified from Michener et al. 1997.]
What is Metadata
· Defending policy decisions based on data
• Regulatory decisions based on undocumented data are not
defensible
• Metadata accuracy and details are important as supporting
evidence for the science and policy
Controversies arise when metadata are incomplete and/or
absent
What is Metadata
Metadata helps…
• Data users
• Organizations
What is Metadata
Metadata allows data developers to:
• Avoid data duplication
• Share reliable information
• Publicize efforts – promote the work of a scientist and
his/her contributions to a field of study
• Save time and resources in the long run through metadata reuse
CC image by USEmbassyGuyana on Flickr
What is Metadata
Metadata gives a user the ability to:
• Search, retrieve, and evaluate dataset
information from both inside and outside
an organization
• Find data: Determine what data exists for
a geographic location and/or topic
• Determine applicability: Decide if a
dataset meets a particular need
• Discover how to acquire the dataset
identified; process and use the dataset
• Understand the dataset, including
definitions of column names, or expected
numerical ranges found in the data
CC image by ASEE on Flickr
What is Metadata
• Metadata helps ensure an organization’s investment
in data:
o Documentation of data processing steps, quality
control, definitions, data uses, and restrictions
o Ability to use data after initial intended purpose
o Allows organization to track data use and
facilitates publication
• Transcends people and time:
o Offers data permanence
o Creates institutional memory
• Advertises an organization’s research:
o Creates possible new partnerships and
collaborations through data sharing
CC image by mambol on Flickr
What is Metadata
Metadata can support:
data distribution
data management
What is Metadata
The descriptive content of the metadata file can be used to
identify, assess, and access available data resources.
IDENTIFY
• keywords
• geographic location
• time period
• attributes
ASSESS
• use constraints
• access constraints
• data quality
• availability/pricing
ACCESS
• online access
• order process
• contacts
What is Metadata
Examples of metadata search catalogs:
o DataONE
• Data discovery, knowledge, community…for a sustainable future
• https://search.dataone.org
o Data.gov
• Federal e-gov geospatial data portal
• http://www.geo.data.gov
o Metacat
• Repository for data and metadata
• http://knb.ecoinformatics.org/index.jsp
o US Geological Survey
• USGS Science Data Catalog
• http://data.usgs.gov/datacatalog
o ArcGIS Online
• ESRI sponsored national geospatial data portal
• http://www.geographynetwork.com
CC image by RGB12 on Flickr
What is Metadata
• Metadata records can be used to track data provenance
accurately
• Data Maintenance:
o Are the data current?
o Are the data in a reliable format?
o Where are the data stored?
• Data Update:
o Contact information
o Distribution policies, availability, pricing, URLs
o New derivations of the dataset
What is Metadata
• Metadata allows you to repeat a scientific process if:
o methodologies are defined
o variables are defined
o analytical parameters are defined
• Metadata allows you to defend your
scientific process:
o demonstrate process
o an increasingly data-savvy public
requires metadata for consumer information
What is Metadata
Metadata is a declaration of:
• Purpose – the originator’s intended
application of the data
• Use Constraints – inappropriate applications
of the data
• Completeness – features or geographies
excluded from the data
• Distribution Liability – explicit liability of the
data producer and assumed liability of the
consumer
What to do…
What not to do…
What is Metadata
Even if the value of data documentation is recognized,
researchers are often concerned about the effort required to
create metadata that effectively describe their data.
CC image by waterlilysage on Flickr
What is Metadata
Concern: workload required to capture accurate, robust metadata
Solution: incorporate metadata creation into the data development process and distribute the effort
Concern: time and resources to create, manage, and maintain metadata
Solution: include in grant budget and schedule
Concern: readability / usability of metadata
Solution: use a standardized metadata format
Concern: discipline-specific information and ontologies
Solution: use a standard ‘profile’ that supports discipline-specific information
What is Metadata
• A Standard provides a structure to describe data with:
o Common terms to allow consistency between records
o Common definitions for easier interpretation
o Common language for ease of communication
o Common structure to quickly locate information
• In search and retrieval, standards provide:
o Documentation structure in a reliable and predictable
format for computer interpretation
o A uniform summary description of the dataset
CC image by ccarlstead on Flickr
What is Metadata
Components of metadata:
• A metadata standard is
made up of defined
elements, including the type
of information the user
should enter (e.g. text,
numbers, date).
• Examples of elements
include Title, Abstract,
Keyword, Online Link
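To make the idea of defined elements with expected value types concrete, here is a minimal sketch that checks a record against a small element list. The element names, the Python types chosen for them, and the function name check_record are hypothetical illustrations and are not taken from any published standard.

```python
from datetime import date

# Hypothetical element definitions: element name -> expected value type.
# A real standard defines many more elements and stricter domains.
ELEMENT_TYPES = {
    "title": str,
    "abstract": str,
    "keyword": list,
    "online_link": str,
    "publication_date": date,
}


def check_record(record):
    """Return a list of problems: missing elements or values of the wrong type."""
    problems = []
    for name, expected in ELEMENT_TYPES.items():
        if name not in record:
            problems.append(f"missing element: {name}")
        elif not isinstance(record[name], expected):
            actual = type(record[name]).__name__
            problems.append(f"{name}: expected {expected.__name__}, got {actual}")
    return problems
```

An empty list means every defined element is present and holds the expected kind of value; it says nothing about whether the content is accurate.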
What is Metadata
• Dublin Core Element Set
o Emphasis on web resources, publications
o http://dublincore.org/documents/dces/
• FGDC Content Standard for Digital Geospatial Metadata
(CSDGM)
o Emphasis on geospatial data
o The Biological Data Profile (BDP) is a profile of the CSDGM
with an emphasis on biological (and geospatial) data
o https://www.fgdc.gov/metadata/csdgm-standard
• ISO 19115/19139 Geographic information – metadata
o Emphasis on geospatial data and services
o https://www.fgdc.gov/metadata/iso-standards
What is Metadata
• Ecological Metadata Language (EML)
o Focus on ecological data
o http://knb.ecoinformatics.org/eml_metadata_guide.html
• Darwin Core
o Emphasis on museum specimens
o http://rs.tdwg.org/dwc/index.htm
• Geography Markup Language (GML)
o Emphasis on geographic features (roads, highways,
bridges)
o http://www.opengeospatial.org/standards/gml
What is Metadata
Terminology for the same concepts may vary across standards:
Ecological Metadata Language (EML) / FGDC Content Standard for Digital Geospatial Metadata (CSDGM)
• Title / Title
• Abstract / Abstract
• Entity Description / Entity Type Definition
• Intellectual Rights / Use Constraints
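The term pairs above can be treated as a small crosswalk. The sketch below is illustrative only: the keys are the display labels from this slide, not the actual XML element names of either standard, and the function name crosswalk is hypothetical.

```python
# Crosswalk built from the four term pairs on this slide (EML label -> CSDGM label).
EML_TO_CSDGM = {
    "Title": "Title",
    "Abstract": "Abstract",
    "Entity Description": "Entity Type Definition",
    "Intellectual Rights": "Use Constraints",
}


def crosswalk(eml_record):
    """Relabel EML-style fields with CSDGM-style labels.

    Returns (converted, unmapped): the relabeled fields, plus the names of
    any input fields that have no entry in the crosswalk.
    """
    converted = {}
    unmapped = []
    for field, value in eml_record.items():
        target = EML_TO_CSDGM.get(field)
        if target is None:
            unmapped.append(field)
        else:
            converted[target] = value
    return converted, unmapped
```

Reporting unmapped fields, rather than silently dropping them, makes it visible where the two standards do not line up.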
What is Metadata
• Many standards collect similar information
• Factors to consider:
o Your data type:
• Are you working mainly with GIS data? Raster/vector or point data?
Do you have biological or shoreline information in your dataset?
- Consider the FGDC Content Standard for Digital Geospatial
Metadata with one of its profiles: the Biological Data Profile or the
Shoreline Data Profile.
• Are you working with data retrieved from instruments such as
monitoring stations or satellites? Are you using geospatial data
services such as web-mapping applications or data modeling?
- If so, then consider using the ISO 19115-2 standard
• Are you mainly working with ecological data?
- Consider Ecological Metadata Language (EML)
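The data-type questions above amount to a simple rule of thumb, sketched here as a lookup. The category labels ("gis", "instrument", "services", "ecological") and the function name suggest_standard are hypothetical; organizational policy and available tools may still point to a different choice.

```python
def suggest_standard(data_kind, has_biological=False, has_shoreline=False):
    """Suggest a metadata standard from the data-type questions on this slide."""
    kind = data_kind.lower()
    if kind == "gis":
        # The CSDGM profiles extend the base standard for specific content.
        if has_biological:
            return "FGDC CSDGM with the Biological Data Profile"
        if has_shoreline:
            return "FGDC CSDGM with the Shoreline Data Profile"
        return "FGDC CSDGM"
    if kind in ("instrument", "services"):
        return "ISO 19115-2"
    if kind == "ecological":
        return "EML"
    return "no suggestion; check organizational policy and available tools"
```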
What is Metadata
• More Factors to consider:
o Your organization’s policies: do they state which standard to use?
o What resources are available to create metadata?
Examples of Tools:
• FGDC CSDGM:
- https://www.fgdc.gov/metadata/geospatial-metadata-tools#availabletools
• EML:
- Morpho (http://knb.ecoinformatics.org/morphoportal.jsp)
• ISO: (http://www.fgdc.gov/metadata/iso-metadata-editor-review)
- XML Spy or Oxygen
- CatMDEdit
o Other factors: Availability of human support; instructional materials; use of
controlled vocabularies; output formats
What is Metadata
Metadata are developed continuously throughout the
entire data lifecycle
[Figure: the data life cycle: Plan, Collect, Assure, Describe, Preserve, Discover, Integrate, Analyze]
What is Metadata
Consistency with commonly used fields
Examples for an FGDC CSDGM record:
Publisher:
✔ <publish>U.S. Geological Survey</publish>
✗ <publish>USGS</publish>
Date:
✔ <pubdate>YYYYMMDD</pubdate>
✔ <pubdate>YYYY</pubdate>
✗ <pubdate>MM/DD/YYYY</pubdate>
✗ <pubdate>May 27, 2003</pubdate>
Keywords:
<placekt>Geographic Names Information System</placekt>
<placekey>Roosevelt National Forest</placekey>
<themekey>Roosevelt Forest</themekey>
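A date check like the one in the examples above is easy to automate. The sketch below assumes that YYYY and YYYYMMDD are the intended acceptable forms for pubdate and that anything else should be flagged; the function name is_consistent_pubdate is hypothetical and not part of any FGDC tool.

```python
import re
from datetime import datetime


def is_consistent_pubdate(value):
    """Return True for YYYY or a real calendar date written as YYYYMMDD."""
    if re.fullmatch(r"[0-9]{4}", value):
        return True
    if re.fullmatch(r"[0-9]{8}", value):
        try:
            # Rejects impossible dates such as month 13 or day 45.
            datetime.strptime(value, "%Y%m%d")
        except ValueError:
            return False
        return True
    return False
```

With this check, "20030527" and "2003" pass, while "05/27/2003" and "May 27, 2003" are flagged for correction.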
What is Metadata
Use Authority Files and Standard Vocabulary
• Global Change Master Directory
• Geographic Names Information System
• Getty Thesaurus of Geographic Names
• ISO 19115 Topic Category Thesaurus
Photo by mxgirl2014 on Flickr
What is Metadata
Acronyms
Spell out acronyms on first use. Many acronyms have
multiple meanings (e.g., DOI).
Use widely known acronyms only when they correspond to
specific metadata fields, such as file formats (e.g., TIFF, JPEG,
PDF).
What is Metadata
Provide all of the critical information for discovery, understanding,
and reuse:
• Identification Information
• Entities & Attributes
• Data Quality
• Access, Use & Liability Constraints
• Distribution
• Spatial References
What is Metadata
Provide all of the critical information for: Identification
What is Metadata
Provide all of the critical information for: Entity / Attribute
What is Metadata
Provide all of the critical information for: Data Quality
Inform
- Accuracy
- Consistency
- Completeness
What is Metadata
Provide all of the critical information for: Data Lineage
What is Metadata
Provide all of the critical information for: Access, Use, & Liability
Constraints
Access Constraints: restrictions and legal prerequisites for accessing the data.
Use Constraints: restrictions and legal prerequisites for using the data after access is granted.
Example: Use_Constraints:
Users are free to use, copy, distribute, transmit, and adapt the work for commercial and non-commercial
purposes, without restriction, as long as clear attribution of the source is provided.
Distribution Liability: statement of the liability assumed by the distributor with respect to
content and accuracy of the data.
Example: Distribution_Liability:
Unless otherwise stated, all data, metadata and related materials are considered to satisfy the quality
standards relative to the purpose for which the data were collected. Although these data and associated
metadata have been reviewed for accuracy and completeness and approved for release by the U.S.
Geological Survey (USGS), no warranty expressed or implied is made regarding the display or utility of the
data on any other system or for general or scientific purposes, nor shall the act of distribution constitute any
such warranty.
What is Metadata
Provide all of the critical information for: Accessing the Data
What is Metadata
Provide all of the critical information for: Spatial Reference
What is Metadata
1. Organize your information
• Did you write a project abstract to obtain funding for your proposal?
Reuse it in your metadata!
• Did you use a lab notebook or other notes during the data development
process that define measurements and other parameters?
• Do you have the contact information for colleagues you worked with?
• What about citations for other data sources you used in your project?
2. Write your metadata using a metadata tool
3. Review for accuracy and completeness
4. Have someone else read your record
5. Revise the record, based on comments from your reviewer
6. Review once more before you publish
What is Metadata
Titles, Titles, Titles…
• Titles are critical in helping readers find your data
o While individuals are searching for the most appropriate
datasets, they are most likely going to use the title as the
first criterion to determine if a dataset meets their needs.
o Treat the title as the opportunity to sell your dataset.
• A complete title includes: What, Where, When, Who, and
Scale
• An informative title includes: topic, timeliness of the data,
specific information about place and geography
What is Metadata
A Clear Choice: Which title is better?
• Rivers
OR
• Greater Yellowstone Rivers from 1:126,700 U.S. Forest
Service Visitor Maps (1961-1983)
Greater Yellowstone (where) Rivers (what) from 1:126,700
(scale) U.S. Forest Service (who) Visitor Maps (1961-1983)
(when)
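The annotated title above can be assembled from its parts, which also makes a missing part easy to spot. This is a sketch only: the part name "source" (for "Visitor Maps") and the function name build_title are assumptions added for illustration, and the word order simply follows this one example.

```python
# Parts of a complete title, following the annotated example on this slide.
# "source" is a hypothetical label for the originating product.
REQUIRED_PARTS = ("where", "what", "scale", "who", "source", "when")


def build_title(parts):
    """Assemble a title from its parts, refusing to proceed if any part is empty."""
    missing = [name for name in REQUIRED_PARTS if not parts.get(name)]
    if missing:
        raise ValueError("title is missing: " + ", ".join(missing))
    return "{where} {what} from {scale} {who} {source} ({when})".format(**parts)


title = build_title({
    "where": "Greater Yellowstone",
    "what": "Rivers",
    "scale": "1:126,700",
    "who": "U.S. Forest Service",
    "source": "Visitor Maps",
    "when": "1961-1983",
})
```

Passing only {"what": "Rivers"} raises an error that names the missing parts, which is the situation the one-word title "Rivers" illustrates.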
CC image by dolfi on Flickr
What is Metadata
• Be specific and quantify when you can! The goal of a
metadata record is to give the user enough information to
know if they can use the data without contacting the
dataset owner.
Vague: We checked our work and it
looks complete.
Specific: We checked our work
using a random sample of 5 monitoring
sites reviewed by 2 different people. We
determined our work to be 95%
complete based on these visual
inspections.
CC image by PNASH on Flickr
What is Metadata
• Use descriptive and clear writing
• Fully document geographic locations
• Select keywords wisely
• Use thesauri for keywords whenever possible
• Be detailed: there’s no such thing as too much metadata!
CC image by MarcoArment on Flickr
What is Metadata
• Remember: a computer will read your metadata
• Do not use symbols that could be misinterpreted by
software: Examples: ! @ # % { } | /  < > ~
• Don’t use tabs, indents, or line feeds/carriage returns
• When copying and pasting from other sources, use a text
editor (e.g., Notepad) to eliminate hidden characters
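The tips above can be turned into a quick pre-submission check. The sketch below is one possible approach, not a feature of any metadata tool: the symbol set is copied from the examples listed on this slide, and the function names find_risky_characters and flatten_whitespace are hypothetical.

```python
# Symbols listed on this slide as open to misinterpretation by software.
RISKY_SYMBOLS = set("!@#%{}|/<>~")


def find_risky_characters(text):
    """Return the distinct risky characters in text, sorted by code point.

    Flags the listed symbols plus anything non-printable, which covers tabs,
    line feeds, carriage returns, and hidden characters picked up when
    pasting from other sources.
    """
    found = set()
    for ch in text:
        if ch in RISKY_SYMBOLS or not ch.isprintable():
            found.add(ch)
    return sorted(found)


def flatten_whitespace(text):
    """Collapse tabs, line breaks, and repeated spaces into single spaces."""
    return " ".join(text.split())
```

Running find_risky_characters on a field before saving shows what needs rewording, and flatten_whitespace removes tabs and line breaks without touching the words.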
What is Metadata
• Metadata is documentation of data
• A metadata record captures critical information about the
content of a dataset
• Metadata allows data to be discovered, accessed, and re-used
• A metadata standard provides structure and consistency to
data documentation
• Standards and tools vary – select according to defined
criteria such as data type, organizational guidance, and
available resources
• Metadata is of critical importance to data developers, data
users, and organizations
• Metadata completes a dataset.
What is Metadata
· Federal Policies:
o Executive Order 12906
o M-13-13 Open Data Policy
· Data Catalogs:
o DataONE
o USGS Science Data Catalog
o Data.gov
· More about CSDGM & ISO 19115:
o FGDC Geospatial Metadata Website
· Metadata Tools:
o Metadata Wizard ✓
o TKME ✓ *
o CatMDEdit
o GRIIDC Metadata Editor ✓
o ArcGIS 10.2 ✓
· Standard Vocabularies:
o USGS Thesaurus
o Global Change Master Directory
o Geographic Names Information System
o Getty Thesaurus of Geographic Names
✓ Provides validation
* Supports Biological Data Profile, Shoreline Profile, and
Remote Sensing Extension
Photo by J B on Flickr
What is Metadata
The full slide deck may be downloaded from:
http://www.dataone.org/education-modules
Suggested citation:
DataONE Education Module: Metadata. DataONE. Retrieved
Nov 12, 2016, from
http://www.dataone.org/sites/all/documents/L07_Metadata.pptx
Copyright license information:
No rights reserved; you may enhance and reuse for
your own purposes. We do ask that you provide
appropriate citation and attribution to DataONE.

DataONE Education Module 07: Metadata

  • 1. What is Metadata Lesson 7: Metadata CCimagebybonusonFlickr
  • 2. What is Metadata • Explanation of metadata • Illustrate the value and utility of metadata to data users, data providers, and organizations • Examine information included in a metadata record • Examples of metadata standards and how to choose • Preparing to write metadata • Tips for writing a quality metadata record CCimagebyAlecCouros onFlickr
  • 3. What is Metadata After completing this lesson, the participant will be able to: • Identify and list the types of information typically included in metadata records for environmental datasets • Identify 3 reasons metadata is of value to data users, data developers, and organizations • List 3 uses for metadata, beyond discovery of data • Identify and describe factors that may determine which metadata standards are most appropriate for a given dataset • List steps to prepare to write metadata • Explain how to write good metadata
  • 5. What is Metadata Metadata is: Data ‘reporting’ • WHO created the data? • WHAT is the content of the data? • WHEN were the data created? • WHERE is it geographically? • HOW were the data developed? • WHY were the data developed? PhotobyMichelleChang.AllRightsReserved
  • 6. What is Metadata • Metadata is all around… Author(s) Boullosa, Carmen. Title(s) They're cows, we're pigs / by Carmen Boullosa Place New York : Grove Press, 1997. Physical Descr viii, 180 p ; 22 cm. Subject(s) Pirates Caribbean Area Fiction. Format Fiction CCimagebyUSDAgovonFlickr CCimagebyMskaduonFlickr
  • 7. What is Metadata · Data Discovery Metadata: captures information USGS Science Data Catalog: enabling discovery DataONE: enables exchange
  • 8. What is Metadata InformationContent Time · Scientific Understanding and Reuse Time of data development Accident Retirement or career change Death (modified from Michener et al. 1997) Specific details General details
  • 9. What is Metadata · Defending policy decisions based on data • Regulatory decisions based on undocumented data are not defensible • Metadata accuracy and details are important as supporting evidence for the science and policy Controversies arise when metadata are incomplete and/or absent
  • 11. What is Metadata Metadata allows data developers to: • Avoid data duplication • Share reliable information • Publicize efforts – promote the work of a scientist and his/her contributions to a field of study • Metadata reuse saves time and resources in the long-run CCimagebyUSEmbassyGuyanaonFlickr
  • 12. What is Metadata Metadata gives a user the ability to: • Search, retrieve, and evaluate dataset information from both inside and outside an organization • Find data: Determine what data exists for a geographic location and/or topic • Determine applicability: Decide if a dataset meets a particular need • Discover how to acquire the dataset identified; process and use the dataset • Understand the dataset, including definitions of column names, or expected numerical ranges found in the data CCimagebyASEEonFlickr
  • 13. What is Metadata • Metadata helps ensure an organization’s investment in data: o Documentation of data processing steps, quality control, definitions, data uses, and restrictions o Ability to use data after initial intended purpose o Allows organization to track data use and facilitates publication • Transcends people and time: o Offers data permanence o Creates institutional memory • Advertises an organization’s research: o Creates possible new partnerships and collaborations through data sharing CCimagebymambolonFlickr
  • 14. What is Metadata Metadata can support: data distribution data management
  • 15. What is Metadata The descriptive content of the metadata file can be used to identify, assess, and access available data resources. • online access • order process • contacts ACCESS • use constraints • access constraints • data quality • availability/pricing ASSESS • keywords • geographic location • time period • attributes IDENTIFY
  • 16. What is Metadata Examples of metadata search catalogs: o DataONE • Data discovery, knowledge, community…for a sustainable future • https://search.dataone.org o Data.gov • Federal e-gov geospatial data portal • http://www.geo.data.gov o Metacat • Repository for data and metadata • http://knb.ecoinformatics.org/index.jsp o US Geological Survey • USGS Science Data Catalog • http://data.usgs.gov/datacatalog o ArcGIS Online • ESRI sponsored national geospatial data portal • http://www.geographynetwork.com CCimagebyRGB12onFlickr
  • 18. What is Metadata • Metadata records can be used to track data provenance accurately • Data Maintenance: o Are the data current? o Are the data in a reliable format? o Where are the data stored? • Data Update: o Contact information o Distribution policies, availability, pricing, URLs o New derivations of the dataset
  • 19. What is Metadata • Metadata allows you to repeat a scientific process if: o methodologies are defined o variables are defined o analytical parameters are defined • Metadata allows you to defend your scientific process: o demonstrate process o increasingly data savvy public requires metadata for consumer information INPUT RESULTS
  • 20. What is Metadata Metadata is a declaration of: • Purpose – the originator’s intended application of the data • Use Constraints - inappropriate applications of the data • Completeness - features or geographies excluded from the data • Distribution Liability - explicit liability of the data producer and assumed liability of the consumer What to do… What not to do…
  • 21. What is Metadata Even if the value of data documentation is recognized, researchers are often concerned about the effort required to create metadata that effectively describe their data. CCimagebywaterlilysageonFlickr
  • 22. What is Metadata Concern Solution workload required to capture accurate robust metadata incorporate metadata creation into data development process – distribute the effort time and resources to create, manage, and maintain metadata include in grant budget and schedule readability / usability of metadata use a standardized metadata format discipline specific information and ontologies Use a standard ‘profile’ that supports discipline specific information
  • 23. What is Metadata • A Standard provides a structure to describe data with: o Common terms to allow consistency between records o Common definitions for easier interpretation o Common language for ease of communication o Common structure to quickly locate information • In search and retrieval, standards provide: o Documentation structure in a reliable and predictable format for computer interpretation o A uniform summary description of the dataset CCimageby ccarlsteadonFlickr
  • 24. What is Metadata Components of metadata: • A metadata standard is made up of defined elements, including the type of information the user should enter (e.g. text, numbers, date). • Examples of elements include Title, Abstract, Keyword, Online Link
  • 27. What is Metadata • Dublin Core Element Set o Emphasis on web resources, publications o http://dublincore.org/documents/dces/ • FGDC Content Standard for Digital Geospatial Metadata (CSDGM) o Emphasis on geospatial data o The Biological Data Profile (BDP) of the CSDGM is a profile to the CSDGM with an emphasis on biological data (and geospatial) ohttps://www.fgdc.gov/metadata/csdgm-standard • ISO 19115/19139 Geographic information – metadata o Emphasis on geospatial data and services o https://www.fgdc.gov/metadata/iso-standards
  • 28. What is Metadata • Ecological Metadata Language (EML) o Focus on ecological data o http://knb.ecoinformatics.org/eml_metadata_guide.html • Darwin Core oEmphasis on museum specimens o http://rs.tdwg.org/dwc/index.htm • Geography Markup Language (GML) o Emphasis on geographic features (roads, highways, bridges) o http://www.opengeospatial.org/standards/gml
  • 29. What is Metadata Terminology for the same concepts may vary across standards. Ecological Metadata Language (EML) term | FGDC Content Standard for Digital Geospatial Metadata term: Title | Title; Abstract | Abstract; Entity Description | Entity Type Definition; Intellectual Rights | Use Constraints
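The term pairs on this slide amount to a small crosswalk between two standards. A minimal sketch in Python follows, assuming records are held as simple dictionaries keyed by the human-readable labels shown on the slide (not the actual XML element names of either standard); the `translate_labels` helper and the sample record are hypothetical.

```python
# Label pairs taken from the slide: EML label -> FGDC CSDGM label.
EML_TO_CSDGM = {
    "Title": "Title",
    "Abstract": "Abstract",
    "Entity Description": "Entity Type Definition",
    "Intellectual Rights": "Use Constraints",
}


def translate_labels(record, crosswalk):
    """Relabel a record; return (translated record, labels with no mapping)."""
    translated = {}
    unmapped = []
    for label, value in record.items():
        if label in crosswalk:
            translated[crosswalk[label]] = value
        else:
            unmapped.append(label)
    return translated, unmapped


# Made-up sample record using EML-style labels.
eml_record = {
    "Title": "Frog call survey",
    "Intellectual Rights": "Attribution requested",
    "Methods": "Night-time point counts",
}
csdgm_record, leftovers = translate_labels(eml_record, EML_TO_CSDGM)
print(csdgm_record)
print(leftovers)
```

Returning the unmapped labels separately keeps the sketch honest about the limits of a crosswalk: content with no counterpart in the target standard needs a human decision rather than a silent drop.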
  • 30. What is Metadata • Many standards collect similar information • Factors to consider: o Your data type: • Are you working mainly with GIS data? Raster/vector or point data? Do you have biological or shoreline information in your dataset? - Consider the FGDC Content Standard for Digital Geospatial Metadata with one of its profiles: the Biological Data Profile or the Shoreline Data Profile. • Are you working with data retrieved from instruments such as monitoring stations or satellites? Are you using geospatial data services such as web-mapping applications or data modeling? - If so, then consider using the ISO 19115-2 standard • Are you mainly working with ecological data? - Consider Ecological Metadata Language (EML)
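The data-type questions above can be read as a simple lookup. The sketch below is one possible encoding in Python; the category keys (`"gis_biological"`, `"instrument"`, and so on) and the `suggest_standard` function are invented for illustration, and the suggestions are only the ones named on the slide, not an exhaustive or authoritative rule.

```python
# Hypothetical category keys mapped to the suggestions named on the slide.
SUGGESTIONS = {
    "gis_biological": "FGDC CSDGM with the Biological Data Profile",
    "gis_shoreline": "FGDC CSDGM with the Shoreline Data Profile",
    "instrument": "ISO 19115-2",
    "geospatial_service": "ISO 19115-2",
    "ecological": "Ecological Metadata Language (EML)",
}


def suggest_standard(data_kind):
    """Return the slide's suggestion for a data kind, or None if not covered."""
    return SUGGESTIONS.get(data_kind)


print(suggest_standard("ecological"))
print(suggest_standard("audio"))  # not covered by the slide
```

A `None` result simply means the slide gives no guidance for that kind of data; in practice the choice would also weigh organizational policy and available tools.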
  • 31. What is Metadata • More Factors to consider: o Your organization’s policies: do they state which standard to use? o What resources are available to create metadata? Examples of Tools: • FGDC CSDGM: - https://www.fgdc.gov/metadata/geospatial-metadata-tools#availabletools • EML: - Morpho (http://knb.ecoinformatics.org/morphoportal.jsp) • ISO: (http://www.fgdc.gov/metadata/iso-metadata-editor-review) - XML Spy or Oxygen - CatMD o Other factors: Availability of human support; instructional materials; use of controlled vocabularies; output formats
  • 32. What is Metadata Metadata are developed continuously throughout the entire data lifecycle Plan Collect Assure Describe Preserve Discover Integrate Analyze
  • 33. What is Metadata Consistency with commonly used fields. Examples for a FGDC CSDGM record: Publisher: ✔ <publish>U.S. Geological Survey</publish> ✗ <publish>USGS</publish> Date: <pubdate>YYYYMMDD</pubdate> <pubdate>YYYY</pubdate> <pubdate>MM/DD/YYYY</pubdate> <pubdate>May 27, 2003</pubdate> Keywords: <placekt>Geographic Names Information System</placekt> <placekey>Roosevelt National Forest</placekey> <themekey>Roosevelt Forest</themekey>
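Consistent date formatting is easy to check automatically. The following Python sketch accepts only compact numeric dates of the form YYYY, YYYYMM, or YYYYMMDD. The slide shows YYYYMMDD and YYYY; accepting YYYYMM as well is an assumption, and the sketch does not handle any free-text values a standard may allow in a date field. The function name `is_compact_date` is hypothetical.

```python
import re
from datetime import date

# Four-digit year, optionally followed by a two-digit month and day.
_DATE_PATTERN = re.compile(r"([0-9]{4})([0-9]{2})?([0-9]{2})?")


def is_compact_date(text):
    """Return True for YYYY, YYYYMM, or YYYYMMDD with a valid month and day."""
    match = _DATE_PATTERN.fullmatch(text)
    if match is None:
        return False
    year, month, day = match.groups()
    if month is None:
        return True
    if not 1 <= int(month) <= 12:
        return False
    if day is None:
        return True
    try:
        date(int(year), int(month), int(day))  # rejects e.g. February 30
    except ValueError:
        return False
    return True


for candidate in ["20030527", "2003", "05/27/2003", "May 27, 2003"]:
    print(candidate, is_compact_date(candidate))
```

Of the four date strings shown on the slide, this check accepts the two compact forms and rejects the slash-separated and spelled-out forms.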
  • 34. What is Metadata Use Authority Files and Standard Vocabulary o Global Change Master Directory o Geographic Names Information System o Getty Thesaurus of Geographic Names o ISO 19115 Topic Category Thesaurus Photo by mxgirl2014 on flickr
  • 35. What is Metadata Acronyms Spell out acronyms at first use. Many acronyms have multiple meanings (e.g., DOI). Use widely known acronyms on their own only when they correspond to specific metadata fields such as file formats (e.g., TIFF, JPEG, PDF)
  • 36. What is Metadata Provide all of the critical information for discovery, understanding, and reuse: • Identification Information • Entities & Attributes • Data Quality • Access, Use & Liability Constraints • Distribution • Spatial References
  • 37. What is Metadata Provide all of the critical information for: Identification
  • 38. What is Metadata Provide all of the critical information for: Entity / Attribute
  • 39. What is Metadata Provide all of the critical information for: Data Quality Information - Accuracy - Consistency - Completeness
  • 40. What is Metadata Provide all of the critical information for: Data Lineage
  • 41. What is Metadata Provide all of the critical information for: Access, Use, & Liability Constraints Access Constraints: restrictions and legal prerequisites for accessing the data. Use Constraints: restrictions and legal prerequisites for using the data after access is granted. Example: Use_Constraints: Users are free to use, copy, distribute, transmit, and adapt the work for commercial and non-commercial purposes, without restriction, as long as clear attribution of the source is provided. Distribution Liability: statement of the liability assumed by the distributor with respect to content and accuracy of the data. Example: Distribution_Liability: Unless otherwise stated, all data, metadata and related materials are considered to satisfy the quality standards relative to the purpose for which the data were collected. Although these data and associated metadata have been reviewed for accuracy and completeness and approved for release by the U.S. Geological Survey (USGS), no warranty expressed or implied is made regarding the display or utility of the data on any other system or for general or scientific purposes, nor shall the act of distribution constitute any such warranty.
  • 42. What is Metadata Provide all of the critical information for: Accessing the Data
  • 43. What is Metadata Provide all of the critical information for: Spatial Reference
  • 44. What is Metadata 1. Organize your information • Did you write a project abstract to obtain funding for your proposal? Re-use it in your metadata! • Did you use a lab notebook or other notes during the data development process that define measurements and other parameters? • Do you have the contact information for colleagues you worked with? • What about citations for other data sources you used in your project? 2. Write your metadata using a metadata tool 3. Review for accuracy and completeness 4. Have someone else read your record 5. Revise the record, based on comments from your reviewer 6. Review once more before you publish
  • 45. What is Metadata Titles, Titles, Titles… • Titles are critical in helping readers find your data o While individuals are searching for the most appropriate datasets, they are most likely going to use the title as the first criterion to determine if a dataset meets their needs. o Treat the title as the opportunity to sell your dataset. • A complete title includes: What, Where, When, Who, and Scale • An informative title includes: topic, timeliness of the data, specific information about place and geography
  • 46. What is Metadata A Clear Choice: Which title is better? • Rivers OR • Greater Yellowstone Rivers from 1:126,700 U.S. Forest Service Visitor Maps (1961-1983) Greater Yellowstone (where) Rivers (what) from 1:126,700 (scale) U.S. Forest Service (who) Visitor Maps (1961-1983) (when) CCimagebydolfion Flickr
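The what/where/when/who/scale breakdown of the better title can be sketched as a small Python helper that refuses to build a title when a part is missing. The `TitleParts` class, the `build_title` function, and the extra `source` field (used here for "Visitor Maps") are assumptions for illustration; the word order simply mirrors the example title on the slide.

```python
from dataclasses import dataclass


@dataclass
class TitleParts:
    where: str
    what: str
    scale: str
    who: str
    source: str  # assumed extra part, e.g. the map series the data came from
    when: str


def build_title(parts):
    """Assemble a title in the order used by the slide's example."""
    missing = [name for name, value in vars(parts).items() if not value.strip()]
    if missing:
        raise ValueError("title is missing: " + ", ".join(missing))
    return (
        f"{parts.where} {parts.what} from {parts.scale} "
        f"{parts.who} {parts.source} ({parts.when})"
    )


parts = TitleParts(
    where="Greater Yellowstone",
    what="Rivers",
    scale="1:126,700",
    who="U.S. Forest Service",
    source="Visitor Maps",
    when="1961-1983",
)
print(build_title(parts))
```

With the parts taken from the slide, the helper reproduces the longer title; a bare "Rivers" would fail the completeness check because every other part is empty.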
  • 47. What is Metadata • Be specific and quantify when you can! The goal of a metadata record is to give the user enough information to know if they can use the data without contacting the dataset owner. Vague: We checked our work and it looks complete. Specific: We checked our work using a random sample of 5 monitoring sites reviewed by 2 different people. We determined our work to be 95% complete based on these visual inspections. CCimagebyPNASHonFlickr
  • 48. What is Metadata • Use descriptive and clear writing • Fully document geographic locations • Select keywords wisely • Use thesauri for keywords whenever possible • Be detailed: there’s no such thing as too much metadata! CCimagebyMarcoArmentonFlickr
  • 49. What is Metadata • Remember: a computer will read your metadata • Do not use symbols that could be misinterpreted by software: Examples: ! @ # % { } | / < > ~ • Don’t use tabs, indents, or line feeds/carriage returns • When copying and pasting from other sources, use a text editor (e.g., Notepad) to eliminate hidden characters
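Because a computer will read the record, the character rules above can be checked before text is pasted into a metadata field. The Python sketch below flags the symbols listed on the slide, tabs and line breaks, and any other non-printable character (for example a zero-width space picked up from a word processor). The function name `find_problem_characters` is hypothetical, and treating every non-printable character as "hidden" is an assumption.

```python
# Symbols listed on the slide as open to misinterpretation by software.
FLAGGED_SYMBOLS = set("!@#%{}|/<>~")
FLAGGED_WHITESPACE = {"\t": "tab", "\n": "line feed", "\r": "carriage return"}


def find_problem_characters(text):
    """Return (position, description) pairs for characters to review."""
    problems = []
    for position, char in enumerate(text):
        if char in FLAGGED_SYMBOLS:
            problems.append((position, char))
        elif char in FLAGGED_WHITESPACE:
            problems.append((position, FLAGGED_WHITESPACE[char]))
        elif not char.isprintable():
            # Assumption: any other non-printable character is unwanted.
            problems.append((position, f"hidden character U+{ord(char):04X}"))
    return problems


print(find_problem_characters("Depth < 5 m"))
print(find_problem_characters("Sites sampled in 2003"))
```

The check only reports positions; deciding whether to reword ("less than 5 m") or remove the character is left to the person writing the record.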
  • 50. What is Metadata • Metadata is documentation of data • A metadata record captures critical information about the content of a dataset • Metadata allows data to be discovered, accessed, and re-used • A metadata standard provides structure and consistency to data documentation • Standards and tools vary – select according to defined criteria such as data type, organizational guidance, and available resources • Metadata is of critical importance to data developers, data users, and organizations • Metadata completes a dataset.
  • 51. What is Metadata · Federal Policies: o Executive Order 12906 o M-13-13 Open Data Policy · Data Catalogs: o DataONE o USGS Science Data Catalog o Data.gov · More about CSDGM & ISO 19115: o FGDC Geospatial Metadata Website · Metadata Tools: o Metadata Wizard ✓ o TKME ✓ * o CatMDEdit o GRIIDC Metadata Editor ✓ o ArcGIS 10.2 ✓ · Standard Vocabularies: o USGS Thesaurus o Global Change Master Directory o Geographic Names Information System o Getty Thesaurus of Geographic Names · Legend: Supports Biological Data Profile; * Supports Biological Data Profile, Shoreline Profile, and Remote Sensing Extension; ✓ Provides validation. Photo by J B on flickr
  • 52. What is Metadata The full slide deck may be downloaded from: http://www.dataone.org/education-modules Suggested citation: DataONE Education Module: Metadata. DataONE. Retrieved Nov 12, 2016. From http://www.dataone.org/sites/all/documents/L07_Metadata.pptx Copyright license information: No rights reserved; you may enhance and reuse for your own purposes. We do ask that you provide appropriate citation and attribution to DataONE.

Editor's Notes

  • #3: In this segment of the course we will cover: What is metadata? What are examples of metadata in our daily lives? And what information needs to be included in a metadata record?
  • #6: Data collection in the field is recorded in a wide variety of ways, including field notebooks, streaming data from satellites, data created from models, etc.
  • #7: After returning from the field, scientists will transfer field notes into spreadsheets and other types of databases in preparation for their data analysis. Displayed here is a partial copy of a data set taken from the website “Frog Watch”. Notice there is no indication of Celsius or Fahrenheit in the “temperature” column. This is a simple example of how it is difficult to understand a dataset without all of the information.
  • #8: Once scientists have collected and analyzed data, they publish their conclusions in appropriate science journals.
  • #9: A dataset is a collection of data. Often datasets are considered spatial or tabular. However, many tabular datasets are inherently spatial – they represent spatial information. There are a variety of elements that can be found in a dataset including values, measures, points, conditions, qualities, frequencies or attributes.
  • #10: If you were to share your data, what type of information would be most useful to understand the data set? Alternatively, when receiving data from an external source, what information is needed to understand the data set? Metadata contains information about the dataset that allows it to be understood when shared amongst scientists.
  • #11: When sharing data, some considerations include: why the data was created; what limitations, if any, the data have; what the data means; and who should be cited if someone publishes something that utilized the data. When receiving data from an alternative source, consider: What are the data gaps? What processes were used for creating the current data? Are there any fees associated with the data? In what scale were the data created? What do the values in the tables mean? What software do I need in order to read the data? What projection is the data in? Can I give this data to someone else? Metadata contain information about a data set, in a standardized format, such that it can be understood and re-used.
  • #12: Metadata is data about data. It describes the content, quality, condition, and other characteristics of a dataset. Metadata records answer questions such as: Why was the data set created? What processes were used to create the data set? What projection is the data in? When was the data last updated? Who created the data? What scale was used? What fields are in the table? What do the values in those fields mean? Who do I contact about getting more information about the data? How do I obtain a copy of the data? Do the data cost anything? Are there any limitations to the data? Metadata is a valuable tool. Metadata records preserve the usefulness of data over time by detailing methods for data collection and data set creation. Metadata greatly minimizes duplication of effort in the collection of expensive digital data and fosters the sharing of digital data resources.
  • #13: Metadata is all around us, from MP3 players, to nutrition labels, to library card catalogues. For example, a card catalogue tells us more than just the title of the book; it also tells the user: Who is the author? Who published the book? What subject area does the book fall in? And finally, where is it located in the library? Another example of metadata that we see in our daily lives is the nutrition and ingredient information on food labels. Nutrition labels answer questions such as: What ingredients were used? Who made the food? How many calories per serving? How many servings in the can? What percentage of daily vitamins are in each serving?
  • #14: An established standard provides common terms, definitions, and structure that allow for consistent communication. The use of standards also supports search and retrieval in automated systems.
  • #15: This is an example of a metadata record using the Federal Geographic Data Committee (FGDC) standard.
  • #16: Metadata is useful to Data Users, Data Developers, and Organizations. In this era of data sharing, collaboration, and need for information organization, metadata can serve multiple purposes.
  • #17: Even if the value of data documentation is recognized, concerns remain as to the effort required to create metadata that effectively describes the data.
  • #18: Metadata does require time and effort to create. The workload, however, is reduced when metadata creation is incorporated into the data development process and the effort is distributed among data contributors. Metadata creation and management should be treated as a standard data development procedure and resources for staff and time should be included in project and proposal work plans and budgets. The use of a standardized metadata format and the development of discipline specific ‘profiles’ of metadata can enable data users to quickly find needed information and address data developer concerns about metadata use and comprehension.
  • #19: What value does metadata have to Data Developers? Metadata records will help avoid data duplication because researchers can determine if data already exists. Scientists are able to share reliable information about a dataset by creating metadata and passing it along with the dataset. Scientists wishing to reuse a dataset can be confident of its origins, data quality, and other valuable information about the data. Metadata also allow data creators to publicize the valuable data they have collected by making the metadata available on clearinghouses and other publicly available venues. Metadata can be used in citation practices, thus increasing the visibility of the data.
  • #20: Metadata allows the user to search for and access data from a variety of sources. A search for metadata can be restricted to a geographic boundary, thus showing the user what data has been collected in a particular region. Metadata records help users determine whether the data will be applicable for use in a particular study. Finally, metadata records are of value to data users because they describe how a dataset can be acquired and whether there are any restrictions on how the data can be used.
  • #21: An organization that keeps current metadata can benefit in many ways. Metadata records help protect the organization’s investment in the data by retaining information about how the data was collected, processed, and quality controlled. This creates a permanent record of the dataset, which is critical institutional memory. When researchers leave or retire, metadata allows the dataset to “live on” for the organization. The data may be reused in another research project in the future, and future researchers in the organization will need to know how the dataset was created. Finally, metadata advertises an organization’s research, creating new potential partnerships and collaboration through data sharing.
  • #22: This graph illustrates the phenomenon of “information entropy” associated with research. At the time of the research project, a scientist’s memory is fresh. Details about the development of the dataset are easily recalled, and it is a good time to document information about the process. Over time, memory of the details begins to fade. A variety of circumstances can intervene, and eventually detailed knowledge about the dataset fades. Without a metadata record, this data might be unusable. A dataset is not considered complete without a metadata record to accompany it.
  • #23: Sound data management is best achieved by making metadata creation a part of the workflow. Not only can it keep the individual scientist organized, but the data has a much better chance of being re-used by future scientists.
  • #24: Metadata is very beneficial because it can be used to support data distribution, data management, and project management. To be best utilized, metadata should be considered a component of the data, created during the development of the data, and populated with rich content.
  • #25: Metadata supports data distribution through discovery, publication, and data portals.
  • #26: Metadata serves data discovery at multiple levels: (1) initial identification by query of keywords, location, time, and attributes; (2) a quick assessment by the scientist of how useful the data is for a project, made by reading the access and use constraints, the data quality measures of positional and attribute accuracy and sources used, and the statements as to data availability, format, and pricing; (3) finding out how to access the data by reading access instructions, any standard order process instructions, and contact information for the dataset.
  • #27: A collection of metadata records can be published to the web in a variety of formats, including: a website catalog; a web accessible folder (WAF) that is later harvested; a Z39.50 clearinghouse; metadata services such as ESRI ArcIMS Metadata Service; a geospatial data portal.
  • #28: Geospatial data portals are plentiful, and contain easily accessible metadata collections from a variety of institutions.
  • #29: The USGS Core Science Metadata Clearinghouse is an example of a metadata repository, available to all researchers.
  • #30: Metadata supports data management in a variety of ways because metadata records can be used to assist data maintenance and update, accountability, data liability, and discovery and reuse.
  • #31: For data management, metadata records can be queried to determine: Do we have data older than 10 years? Do we have data collected before some political or geophysical event that resulted in significant change? Do we have data that used some older or now invalid data as a source? Do we have data that used older or now invalid methods? Global edits to contacts, policies, URLs, and information about new derivations of the data set can be included in metadata records, thus assisting the data management process.
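Two of the management questions in this note can be sketched as simple queries over a collection of records. The Python example below assumes each record has already been parsed into a dictionary with hypothetical `title`, `pubdate` (a compact date string starting with a four-digit year), and `sources` fields; the sample records are made up, and age is judged by publication year only.

```python
from datetime import date

# Made-up, simplified records; field names are assumptions.
RECORDS = [
    {"title": "Stream gauge readings", "pubdate": "2003",
     "sources": ["County base map 1990"]},
    {"title": "Vegetation survey", "pubdate": "20140615", "sources": []},
    {"title": "Road network", "pubdate": "2009",
     "sources": ["County base map 1990"]},
]


def older_than(records, years, today):
    """Titles whose publication year is more than `years` before today's year."""
    cutoff_year = today.year - years
    return [r["title"] for r in records if int(r["pubdate"][:4]) < cutoff_year]


def using_source(records, source_title):
    """Titles of records that list the given source."""
    return [r["title"] for r in records if source_title in r["sources"]]


today = date(2016, 11, 12)  # fixed so the example is repeatable
print(older_than(RECORDS, 10, today))
print(using_source(RECORDS, "County base map 1990"))
```

The same pattern extends to the other questions in the note (invalid methods, events of interest) as long as the relevant information was captured consistently in the records.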
  • #32: It is important to note that metadata is not only useful for those trying to find and reuse good data, it is also very useful to the researcher in managing his/her own data.
  • #33: If you create robust metadata, you can use information contained in the metadata to locate and re-use data. The metadata contains information about themes, geographic location, time ranges, analytical methods, sources, and data quality.
  • #34: Metadata allows you to repeat a scientific process if methodologies, variables, and analytical parameters are well defined. It allows you to defend and demonstrate scientific process. A defensible process enables you to demonstrate the methodologies that led to decisions using the data. Increasingly the savvy public demand metadata with datasets for consumer information purposes.
  • #35: Metadata creation requires that you are accountable for your data and can document everything you know and do not know about your dataset.
  • #36: Metadata is invaluable for data liability. For example, a record will indicate the purpose – why the data was collected, any use constraints that are associated with the data, how complete the dataset is, and who is liable once the data is distributed and reused.
  • #37: Metadata records support project management activities in a variety of ways including project planning, monitoring, coordination, and deliverables.
  • #38: Metadata can serve as a project design document that establishes the project’s intent, extents, suggested source data resources, and database design. By doing so, project expectations are clearly outlined, metadata is immediately integrated into the process and the record serves as a medium to record project progress.
  • #39: If metadata is created during the course of the research, the record can be used to monitor: where the data set is in the data development process; any problems associated with data quality that should be addressed prior to further development; and any problems with proposed methods and/or sources that will require a change in approach to data analysis.
  • #40: Metadata can be a means to improve communications among project participants. A project metadata template can be developed to establish descriptors, parameters (time, geography, species, etc.), vocabularies, contact info, entity/attributes, and distribution information. If actively used by the team, the metadata can be a source for identifying and tracking the use of new source data, analytical methods, and adding in information pertinent to the project used by other project participants.
  • #41: Metadata should be specified as a component of any data deliverable. If a part of a project is contracted out, include metadata as a deliverable in the contract. Be sure to specify the standard required and the level of completion required in the record itself. It is not enough to say ‘compliant metadata’ as you will likely get minimal information in the record. Instead, provide clear guidance and, preferably, a sample record that provides core info (Who should be named as Originator, what liability and use constraints statements to use) and illustrates the level of detail expected.
  • #43: There are many standards available to document data. Each has a different focus, yet asks for similar information about the data set.
  • #44: There are many standards available to document data. Each has a different focus, yet asks for similar information about the data set.
  • #47: More factors to consider: Your organization’s policies: do they state which standard to use? What resources are available to create metadata? Examples of tools: FGDC CSDGM: Mermaid (NOAA) (http://www.ncddc.noaa.gov/metadata-standards/mermaid/), Metavist (Forest Service) (http://ncrs.fs.fed.us/pubs/viewpub.asp?key=2737), TKME (USGS) (http://geology.usgs.gov/tools/metadata/tools/doc/tkme.html). EML: Morpho (http://knb.ecoinformatics.org/morphoportal.jsp). ISO (http://www.fgdc.gov/metadata/iso-metadata-editor-review): XML Spy or Oxygen, CatMD. Other factors: Availability of human support; instructional materials; use of controlled vocabularies; output formats
  • #48: Metadata is documentation of data. A metadata record captures critical information about the content of a dataset. Metadata allows data to be discovered, accessed, and re-used. A metadata standard provides structure and consistency to data documentation. Standards and tools vary – select according to defined criteria such as data type, organizational guidance, and available resources. Metadata is of critical importance to data developers, data users, and organizations. Metadata can be effectively used for: data distribution, data management, and project management. Metadata completes a dataset.