Dan Crane
Research Support Librarian
library-research-support@open.ac.uk
Data Sharing:
How, what and why?
6th February 2018
Overview of the workshop
• Data sharing policies
• Benefits of data sharing
• Data repositories
• Preparing data for sharing
• Re-using data
• Questions/further information
“The coolest thing to do with your data will be thought of by someone else.”
Rufus Pollock, Cambridge University and Open Knowledge Foundation, 2008
Why should you share your data?
Policies: funders…
Since 2017, all Horizon 2020 projects are part of the Open
Research Data Pilot by default
All publications after May 2015 should have a statement
describing how to access underlying data. EPSRC have
said they will check.
Researchers are now required to prepare to share data and
other outputs of their work, such as original software and
research materials like antibodies, cell lines or
reagents.
Why should you share your data?
Policies: publishers…
“An inherent principle of publication is that others should be able to
replicate and build upon the authors' published claims. A condition of
publication in a Nature journal is that authors are required to make
materials, data, code, and associated protocols promptly available
to readers without undue qualifications. Any restrictions on the
availability of materials or information must be disclosed to the editors at
the time of submission. Any restrictions must also be disclosed in
the submitted manuscript.”
http://www.nature.com/authors/policies/availability.html
“PLOS journals require authors to make all data underlying the findings
described in their manuscript fully available without restriction, with rare
exception.
When submitting a manuscript online, authors must provide a Data
Availability Statement describing compliance with PLOS's policy. If the
article is accepted for publication, the data availability statement will be
published as part of the final article.
Refusal to share data and related metadata and methods in accordance
with this policy will be grounds for rejection…”
http://journals.plos.org/plosone/s/data-availability
Why should you share your data?
Policies: Open University…
“In keeping with OU principles of openness, it is expected that research data will be open and accessible to other researchers, as soon as appropriate and verifiable, subject to the application of appropriate safeguards relating to the sensitivity of the data and legal and commercial requirements.”
OU Research Data Management Policy, November 2016
http://www.open.ac.uk/library-research-support/sites/www.open.ac.uk.library-research-support/files/files/Open-University-Research-Data-Management-Policy.pdf
Why should you share your data?
A shared goal
“Good data management is fundamental to all stages of the research process and should be established at the outset.”
“Open access to research data is an enabler of high quality research, a facilitator of innovation and safeguards good research practice.”
Concordat on Open Research Data
http://www.rcuk.ac.uk/documents/documents/concordatonopenresearchdata-pdf/
Why should you share your data?
Innovation
Why should you share your data?
Research integrity
Why should you share your data?
More citations
Why should you share your data?
Exemptions
• “As open as possible, as closed as necessary”
What do you need to share?
• Raw data
• Derived data
• Code
• Methods
What are research data in your context?
What would others need to understand your research?
How to share
Data repositories
Open Research Data Online (ORDO)
Online data sharing services
• Figshare
• Zenodo
• CKAN DataHub
• Mendeley Data
Directories
• re3data
Funders’ repository services
• UK Data Service ReShare
• NERC data centres
ORDO (Open Research Data Online): https://ou.figshare.com
How to share
Data statements
• "All data created during this research are openly available from Lancaster University data archive at http://dx.doi.org/10.17635/lancaster/researchdata/15."
• "All data are provided in full in the results section / the supplementary section of this paper."
• "Crystal structures are available from the Cambridge Crystallographic Data Centre (Identifier BATHRS) at http://dx.doi.org/10.15125/010203, Microscopy images are openly available from Dryad at http://dx.doi.org/10.17635/lancaster/researchdata/1."
Examples taken from Lancaster University: http://www.lancaster.ac.uk/library/rdm/what-is-rdm/preserve-and-share/data-access-statements/
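The first example statement above follows a fixed pattern (repository name plus DOI link), so it can be generated consistently. The following is a minimal sketch only: the function name `data_access_statement` is hypothetical, and normalising links to the `https://doi.org/` resolver is an assumption of this sketch rather than something the slides prescribe.

```python
def data_access_statement(repository: str, doi: str) -> str:
    """Build a one-sentence data access statement for a deposited dataset."""
    doi = doi.strip()
    # Accept a bare DOI or a resolver URL and reduce it to the bare DOI.
    prefixes = (
        "https://doi.org/",
        "http://doi.org/",
        "https://dx.doi.org/",
        "http://dx.doi.org/",
        "doi:",
    )
    for prefix in prefixes:
        if doi.lower().startswith(prefix):
            doi = doi[len(prefix):]
            break
    # Every DOI starts with the directory indicator "10.".
    if not doi.startswith("10."):
        raise ValueError(f"not a DOI: {doi!r}")
    return (
        "All data created during this research are openly available from "
        f"{repository} at https://doi.org/{doi}."
    )


print(data_access_statement(
    "Lancaster University data archive",
    "http://dx.doi.org/10.17635/lancaster/researchdata/15",
))
```

The wording of the statement is copied from the Lancaster example; a publisher or funder may ask for different wording, so treat the template as a starting point.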
Preparing data for sharing
Metadata/documentation
“...make sure that data are fully described, so that consumers have sufficient information to understand their strengths, weaknesses, analytical limitations, and security requirements as well as how to process the data...”
G8 Open Data Charter (2013)
https://www.gov.uk/government/publications/open-data-charter/g8-open-data-charter-and-technical-annex
Preparing data for sharing
Metadata/documentation
What do others need to understand your data?
Embedded documentation
• Code, field and label descriptions
• Descriptive headers or summaries
• Recording information in the Document Properties function of a file (Microsoft)
Supporting documentation
• Working papers or laboratory books
• Questionnaires or interview guides
• Final project reports and publications
• Catalogue metadata
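One common way to provide field descriptions is a small data dictionary file deposited next to the dataset. The sketch below is illustrative only: the field names, descriptions and function names are hypothetical, and the two-column CSV layout is an assumption rather than a required standard.

```python
import csv
import io

# Hypothetical field descriptions for an illustrative survey dataset.
FIELDS = {
    "participant_id": "Anonymised participant identifier",
    "age_band": "Age band in years (18-24, 25-34, ...)",
    "q1_score": "Response to question 1, scale 1 (low) to 5 (high)",
}


def build_data_dictionary(fields: dict[str, str]) -> str:
    """Return a CSV data dictionary with one row per field."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["field", "description"])
    for name, description in fields.items():
        writer.writerow([name, description])
    return buffer.getvalue()


def undocumented_columns(header: list[str], fields: dict[str, str]) -> list[str]:
    """List columns of a data file that have no description yet."""
    return [column for column in header if column not in fields]


print(build_data_dictionary(FIELDS))
print(undocumented_columns(["participant_id", "notes"], FIELDS))
```

Running `undocumented_columns` against the header row of each data file before deposit is a quick check that nothing is shared without a description.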
Preparing data for sharing
File formats
• Unencrypted
• Uncompressed
• Non-proprietary, not patent-encumbered
• Open, documented standard
• Standard representation (ASCII, Unicode)
Type             Recommended                                           Avoid for data sharing
Tabular data     CSV, TSV, SPSS portable                               Excel
Text             Plain text, HTML, RTF; PDF/A only if layout matters   Word
Media            Container: MP4, Ogg; Codec: Theora, Dirac, FLAC       Quicktime, H264
Images           TIFF, JPEG2000, PNG                                   GIF, JPG
Structured data  XML, RDF                                              RDBMS
Further examples: http://www.data-archive.ac.uk/create-manage/format/formats-table
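The table above can be turned into a rough pre-deposit check on file names. This is a sketch under stated assumptions: the mapping from formats to file extensions (for example `.por` for SPSS portable, `.jp2` for JPEG2000, `.mov` for Quicktime) is this sketch's own, the function name `classify` is hypothetical, and an extension cannot reveal a codec such as H264 or whether a PDF is PDF/A, so those cases fall through to a manual check.

```python
from pathlib import Path

# Assumed extension mapping for the "Recommended" column of the table.
RECOMMENDED = {
    ".csv", ".tsv", ".por",          # tabular data
    ".txt", ".html", ".rtf",         # text
    ".mp4", ".ogg", ".flac",         # media
    ".tif", ".tiff", ".jp2", ".png", # images
    ".xml", ".rdf",                  # structured data
}

# Assumed extension mapping for the "Avoid for data sharing" column.
AVOID = {
    ".xls", ".xlsx",                 # Excel
    ".doc", ".docx",                 # Word
    ".mov",                          # Quicktime
    ".gif", ".jpg", ".jpeg",         # images
}


def classify(filename: str) -> str:
    """Return 'recommended', 'avoid' or 'check' based on the file extension."""
    suffix = Path(filename).suffix.lower()
    if suffix in RECOMMENDED:
        return "recommended"
    if suffix in AVOID:
        return "avoid"
    # Unknown or ambiguous formats (e.g. PDF) need a human decision.
    return "check"


for name in ["results.csv", "survey.xlsx", "report.pdf"]:
    print(name, classify(name))
```

A file flagged "avoid" is not necessarily unusable; it is a prompt to export a copy in one of the recommended formats alongside the original.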
Re-using data
Consider...
• Citation
• Purpose
• Discovery
• Access
• Cost
• Licensing
Prepare for...
• Data cleansing
• Data interpretation difficulties
• Data disappearance
Where to look...
• Disciplinary data archives
• re3data
• DataCite
• British Library
• Data access statements
Library Services
How we can help
• Open Research Data Online (ORDO)
• Help with Data Management Plans and consent forms
• Advice on preparation of data for sharing
• Data catalogue on ORO
• Online guidance
• Enquiries
Email: library-research-support@open.ac.uk
Useful links
• The OU Library Research Support website: http://www.open.ac.uk/library-research-support/research-data-management
• Open Research Data Online (ORDO): https://ou.figshare.com
• Digital Curation Centre: http://www.dcc.ac.uk/
• DMP Online: https://dmponline.dcc.ac.uk/
• UK Data Archive: http://www.data-archive.ac.uk/
• MANTRA: http://datalib.edina.ac.uk/mantra/
• The Orb: http://open.ac.uk/blogs/the_orb
Questions?
3 take home points...
1. Sharing your data isn’t just about compliance
2. Select what data to share
3. Good metadata enables re-use
Image credits
Unless otherwise stated, all images are by
Jørgen Stamp at http://www.digitalbevaring.dk
