Preparing your data for sharing and publishing
Varsha Khodiyar, PhD
MRC Cognition and Brain Sciences Unit
Open Science Day 20.11.2018
How chameleons change colour
1
What are researchers concerned about when sharing data?
7719 respondents
White paper available from https://doi.org/10.6084/m9.figshare.5975011
Survey data available from https://doi.org/10.6084/m9.figshare.5971387
2
3
4
How to organise your data
5
Make sure your data are well organised
• Label data files and folders in an understandable way.
• Organise data files and folders in a logical, easy-to-follow manner.
• Clearly define any acronyms used in data file or folder names, ideally in a README file.
• Store data files in a format that is easy for others to reuse, or use the standard format of your discipline.
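The README advice on this slide can be checked mechanically. What follows is only a minimal sketch, not a tool mentioned in the talk: it assumes the README is a plain-text file called README.txt at the top of the data folder, and it treats any run of two or more capital letters in a file or folder name as a candidate acronym. The function name undefined_acronyms is hypothetical.

```python
import re
from pathlib import Path


def undefined_acronyms(data_dir: str, readme_name: str = "README.txt") -> set[str]:
    """Return acronyms used in file/folder names that the README never mentions."""
    root = Path(data_dir)
    readme = root / readme_name
    # An absent README means every acronym counts as undefined.
    readme_text = readme.read_text(encoding="utf-8") if readme.exists() else ""

    acronyms: set[str] = set()
    for path in root.rglob("*"):
        if path.name == readme_name:
            continue
        # Assumption: 2+ consecutive capital letters in a name form an acronym.
        acronyms.update(re.findall(r"[A-Z]{2,}", path.stem))

    # Keep only the acronyms that do not appear anywhere in the README text.
    return {a for a in acronyms if a not in readme_text}
```

Calling undefined_acronyms("my_dataset") on a folder returns the set of acronyms still to be explained; an empty set suggests the README covers every acronym the heuristic found.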
6
Increasing reproducibility
• Include any additional information needed to understand the data,
methods, parameters, e.g. which instrument (make and model) was
used to measure blood carbon dioxide levels?
• Include availability statements for any code that was used to view,
parse or analyse the data, in support of the conclusions.
7
Help to organise your data is available
Springer Nature Research Data Support
1. Researchers submit their data files securely.
2. The Research Data team curates the data and metadata.
3. The data are published and linked to the author’s paper.
More information is available on our website here:
http://www.springernature.com/gb/group/data-policy/data-support-services
8
Before data curation: a researcher’s dataset in a desktop folder
• The dataset is stored as an Excel file in a desktop folder.
• The file title is not comprehensible to anyone but the creator.
• No description or keywords are available.
• No one other than the creator can access the data, or even knows that it exists.
9
Before curation begins
Once received, we check to make sure that the dataset is suitable for our curation services. Multiple files in any format are accepted.
Pre-curation data checks:
• The data aren’t sensitive
• The data don’t include direct or indirect human identifiers
• The data wouldn’t be better placed in a community repository
After making these checks, we begin the curation process. If necessary, we may recommend that the dataset is split into smaller groups or collections.
10
After Springer Nature Research Data Support
Working with the researcher’s manuscript or published paper, we draft a comprehensive metadata record for the dataset, which is sent to the researcher for approval before being published. Embargoes can be applied if necessary.
The curated dataset will be published with its own metadata record, which includes rich descriptive information, reuse conditions, licence, DOI, metrics and keywords (this example is https://doi.org/10.6084/m9.figshare.5259415).
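To make the contents of such a metadata record concrete, here is a rough sketch of how a draft record might be checked for completeness before it goes to the researcher for approval. The class and field names are hypothetical and are not the schema used by Springer Nature or figshare; they simply mirror the items listed on the slide. Metrics are left out on the assumption that the repository generates them after publication.

```python
from dataclasses import dataclass, field


@dataclass
class MetadataRecord:
    """Hypothetical draft record; field names are illustrative only."""
    title: str
    description: str          # rich descriptive information
    reuse_conditions: str
    licence: str
    doi: str
    keywords: list[str] = field(default_factory=list)

    def missing_fields(self) -> list[str]:
        """Names of fields that are still empty, in declaration order."""
        return [name for name, value in vars(self).items() if not value]


# Placeholder values only; this is not a real dataset or DOI.
draft = MetadataRecord(
    title="Example dataset",
    description="Placeholder description of how the data were generated.",
    reuse_conditions="Attribution required",
    licence="CC BY 4.0",
    doi="10.xxxx/placeholder",
)
print(draft.missing_fields())  # the draft has no keywords yet
```

A non-empty result from missing_fields() signals that the draft still needs work before it is sent for approval.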
11
Choosing a repository to store data
12
Selecting a repository for your data
Considerations:
1. Is there a discipline-specific repository for the type of data you
have generated?
2. Will access to the data need to be controlled?
3. If no discipline-specific repository is available for the data,
does your funder or institute mandate deposition to a
particular repository?
13
Sources to help choose a data repository
• Indexing services
• Curated lists: www.nature.com/sdata/data-policies/repositories
• NEW! Tools to help select repositories: https://repositoryfinder.test.datacite.org/
14
Sensitive data
15
Collecting sensitive data
• Consider an appropriate patient consent framework
  • Consent to use data in current study
  • Consent to use data for future research
  • Consent to share data for use by other research groups
• Don’t collect more than you need
16
Sharing sensitive data
• Remove direct identifiers
• Aggregate indirect identifiers into groups where possible
• Anonymization or de-identification?
• Use controlled access repositories, and consider:
  • Data use agreement?
  • Data access conditions?
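The first two points on the "Sharing sensitive data" slide (removing direct identifiers and aggregating indirect identifiers into groups) can be illustrated with a small sketch. The column names and the ten-year age bands are assumptions chosen for the example, not recommendations from the talk, and a transformation like this does not by itself guarantee that a dataset is anonymised.

```python
# Hypothetical column names treated as direct identifiers in this example.
DIRECT_IDENTIFIERS = {"name", "email", "address"}


def age_band(age: int, width: int = 10) -> str:
    """Map an exact age to a band such as '30-39'."""
    lower = (age // width) * width
    return f"{lower}-{lower + width - 1}"


def deidentify(record: dict) -> dict:
    """Drop direct identifiers and coarsen age, an indirect identifier."""
    cleaned = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    if "age" in cleaned:
        # Replace the exact age with its band so individuals are harder to single out.
        cleaned["age_band"] = age_band(cleaned.pop("age"))
    return cleaned


participant = {"name": "A. Participant", "email": "a@example.org", "age": 34, "score": 7}
print(deidentify(participant))  # {'score': 7, 'age_band': '30-39'}
```

Which fields count as direct or indirect identifiers depends on the study, so the set and the band width would need to be decided case by case.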
17
Scholarly credit for generating and sharing research data
18
Data Journals at Springer Nature
Scientific Data (www.nature.com/scientificdata)
Data Descriptor
• Open access
• Sound science
• Emphasis on enabling data reuse
• Data peer review
BMC Research Notes (https://bmcresnotes.biomedcentral.com)
Data Note
• Open access
• Sound science
• Short format
19
Scientific Data, a Nature Research journal
• Data Descriptor: primary article type; sound science and facilitates data reuse
• Analysis: new analyses or meta-analyses of existing data
• Article: original reports on advances in data sharing & reuse
• Comment: announcements of broad interest; usually invited
www.nature.com/scientificdata
20
Under the hood of a Data Descriptor
• Context for data generation (background)
• How was data generated?
• How was data processed?
• Where is the data?
• Synthesis
• Analysis
• Conclusions
21
Data peer review
www.nature.com/sdata/policies/for-referees
Experimental Rigor and Technical Data Quality
• Were data produced in a sound manner?
• Technical quality of data: appropriate statistical analyses?
• Experimental rigor: appropriate depth, coverage?
Completeness of the Description
• Sufficient detail to allow others to reproduce these steps?
• Sufficient detail to allow others to reuse this data?
• Consistent with relevant minimum reporting standards?
Integrity of the Data Files and Repository Record
• Do data files appear complete and match manuscript descriptions?
• Are data archived to the most appropriate repository?
22
What types of data can be published?
Any data that the researcher finds valuable and that others might find useful too:
• Decades-old dataset
• Standalone dataset
• Data that has been used in an analysis article
• Large consortium dataset
• Data from a single experiment
• Data associated with a high impact analysis article
23
When can a data paper be published?
Data papers can be submitted and published at any point in the research workflow, i.e. whenever it makes most sense for your data:
• Before the analysis has been published
• Alongside the analysis article
• After the data analysis has been published
• When the authors do not intend to analyse the data
24
Still unsure about research data?
25
Springer Nature Research Data Training
What does research data training offer?
• Directly addresses the main challenges of data sharing
• Part of the Nature Research Academies, offering trusted quality and value
• A unique perspective and trusted experience within the realm of research data
• Training for both researchers & information professionals, appropriate for all levels
• Courses are customised to meet your needs, and are brought to you (and your researchers)
Source: https://doi.org/10.6084/m9.figshare.5975011
26
Springer Nature Research Data Helpdesk
• Queries are answered within two business days
• Run by members of the Springer Nature Research Data team
• Expertise in data curation and management, archiving and digital preservation, copyright and licensing, Open Access publishing
• Always encourage best practices, e.g. the use of community repositories for specific data types
Email: researchdata@springernature.com
http://www.springernature.com/gp/group/data-policy/helpdesk
27
The story behind the image
How chameleons change colour
Chameleons are well known for their ability to change colour, but recent research on panther chameleons is the first to find two layers of crystal-containing cells, each with a potentially different purpose. Researchers from the University of Geneva have speculated that the deeper crystal-containing cells may help with the regulation of temperature, whilst the more superficial layer of colour-changing cells could be responsible for camouflage or mating displays.
Thank you for listening
Varsha Khodiyar, PhD
Data Curation Manager, Springer Nature
(Data Curation Editor, Scientific Data)
varsha.khodiyar@nature.com
