EPSRC research data expectations and PURE for datasets
Stuart Macdonald
Associate Data Librarian
University of Edinburgh
stuart.macdonald@ed.ac.uk
Information Services: Support for Enhancing Research Impact, JCMB, KB, 22 June 2016
Background
• EDINA and Data Library are a division within Information Services (IS) of the
University of Edinburgh.
• EDINA is a Jisc-funded centre for digital expertise providing national online
resources for education and research.
• Data Library & Consultancy assists Edinburgh University users in the discovery,
access, use and management of research datasets.
• The Data Library forms part of the new Research Data Service – the culmination of
a 48-month RDM Roadmap (Phases 0-4) to implement the University’s RDM
Policy and develop a suite of RDM Services that map onto the research lifecycle.
What is research data?
Research data is defined by EPSRC as recorded factual material commonly used in
the scientific community as necessary to validate research findings.
Although the majority of such data is created in digital format, all research data is
included irrespective of the format in which it is created.
Note that EPSRC does not expect every piece of data produced during a project to
be retained – decisions about what to keep should be taken on a case-by-case
basis.
There is, however, a clear expectation that data which underpins published research
outputs will be retained and managed.
• EPSRC have introduced a policy framework addressing the management
and provision of access to publicly-funded research data.
• EPSRC Principal Investigators and the University must demonstrate to
EPSRC that its expectations are being met. The 9 expectations are
detailed at: http://www.epsrc.ac.uk/about/standards/researchdata/
• EPSRC began monitoring compliance on 1 May 2015 on a case-by-case
basis.
• If EPSRC judges that sharing of research data is being obstructed, it reserves
the right to impose sanctions.
EPSRC policy framework on research data: http://www.epsrc.ac.uk/about/standards/researchdata/impact/
The expectations arise from 7 core principles which align with
the core RCUK principles on data sharing, namely:
• EPSRC-funded research data is a public good produced in the public
interest and should be made openly available with as few restrictions as
possible in a timely and responsible manner.
• EPSRC recognises that there are legal, ethical and commercial constraints
on release of research data.
• Sharing research data is an important contributor to the impact of publicly
funded research.
• EPSRC-funded researchers should be entitled to a limited period of
privileged access to the data they collect to allow them to write up and
publish their results.
• Data management policies and plans (and software management plans!)
should be in accordance with relevant standards and community best
practice and should exist for all data.
• Sufficient metadata should be recorded and made openly available to
enable other researchers to understand the potential re-use of the data
for further research.
• It is appropriate to use public funds to support the preservation and
management of publicly-funded research data.
What do PIs and researchers need to know?
• All researchers or research students funded by EPSRC will be required to
comply with these expectations.
• Data that is not generated in digital format will be stored in a manner to
facilitate it being shared in the event of a valid request for access.
• A link to digital research data is expected to be included in the metadata.
• Where access to data is restricted, published metadata should give the
reason and summarise the conditions which must be satisfied for access
to be granted.
What do PIs and researchers need to do?
• Key expectation 1: The data should be securely stored for at least 10 years.
• Key expectation 2: An online record should be created within 12 months of
the data being generated that describes the research data and how to
access it.
• Key expectation 3: Published research papers should include a short
statement describing how and on what terms any supporting research
data may be accessed.
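The two time limits in the key expectations can be turned into concrete dates. The following is a minimal sketch, not an official compliance tool: the function names are hypothetical, and the date from which the 10-year retention period runs is passed in as an assumption because the slides do not state it.

```python
import calendar
from datetime import date

RECORD_DEADLINE_MONTHS = 12  # key expectation 2: online record within 12 months
RETENTION_YEARS = 10         # key expectation 1: store for at least 10 years


def add_months(start: date, months: int) -> date:
    """Return the date `months` after `start`, clamped to the month's last day."""
    index = start.month - 1 + months
    year = start.year + index // 12
    month = index % 12 + 1
    day = min(start.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def compliance_dates(generated: date, retention_start: date) -> dict:
    """Hypothetical helper: derive both dates for one dataset.

    `retention_start` is an assumption supplied by the caller; check the
    EPSRC expectations for the date that actually applies.
    """
    return {
        "metadata_record_due": add_months(generated, RECORD_DEADLINE_MONTHS),
        "retain_until_at_least": add_months(retention_start, 12 * RETENTION_YEARS),
    }


if __name__ == "__main__":
    print(compliance_dates(date(2016, 6, 22), date(2016, 6, 22)))
```

Clamping to the last day of the month keeps a dataset generated on 29 February from producing an invalid date a year later.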
Key expectation 1: store data securely
• Research data that underpins a publication must be stored safely and securely, and made
accessible.
• Data may already be managed by a trusted domain archive outside of the university, in
which case data may not need to be stored locally.
• If not, data must be stored in a suitable UoE storage solution. Minimal compliance is
achieved by having your data on DataStore and then making a secure copy of it into the
Data Vault (this service is currently in development).
• For those who wish to openly publish data (and a snapshot of their research software),
Edinburgh DataShare is the university’s open online repository of data produced by local
researchers (policies, licence, citation).
• Datasets added to DataShare will also be allocated a persistent identifier (DOI) for
citations.
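The storage options for key expectation 1 amount to a short decision procedure. As a rough sketch only, the function below encodes that reading of the slide; the function name and its two boolean parameters are hypothetical, and it treats open publication in DataShare as an optional extra step rather than a requirement.

```python
from typing import List


def storage_route(in_trusted_archive: bool, publish_openly: bool) -> List[str]:
    """Hypothetical helper: list the storage steps suggested on the slide."""
    steps = []
    if in_trusted_archive:
        # Data already managed by a trusted domain archive outside the university.
        steps.append("Trusted domain archive holds the data; local storage may not be needed")
    else:
        # Minimal compliance route described on the slide.
        steps.append("Store the data on DataStore")
        steps.append("Make a secure copy in the Data Vault")
    if publish_openly:
        # Optional: open publication, which also yields a DOI for citation.
        steps.append("Deposit in Edinburgh DataShare (a DOI is allocated)")
    return steps


if __name__ == "__main__":
    for step in storage_route(in_trusted_archive=False, publish_openly=True):
        print(step)
```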
Key expectation 2: a record describing the data must be freely available online
• Research staff are therefore expected to add a metadata record for any
EPSRC-funded research data, normally within 12 months of the data being
generated.
• The University is using PURE to record descriptive data (metadata) about
the research data in order to meet this expectation.
• To enter a new dataset description in PURE, click on the green ‘Add new’
button, and select ‘Dataset’.
• Once added to PURE via the dataset content type, the resulting record
should link to the funding source and also link to any associated
publications.
Data Catalogue in PURE
• If the dataset is available online, e.g. in DataShare, a persistent identifier
(or DOI) of that dataset should also be added.
• Where access to the data is to be restricted, the published dataset
metadata in PURE should give the reason and summarise the conditions
which must be satisfied to grant access.
• Dataset metadata added to PURE will be publicly accessible via the
Edinburgh Research Explorer subject to confidentiality and other such
restrictions.
• Dataset metadata from both PURE and DataShare is harvested by the pilot
UKRDDS - http://ckan.data.alpha.jisc.ac.uk/dataset
– Currently no interoperation between PURE and DataShare
– Work commencing to convert PURE v. 5 API into an OAI-PMH end-point
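The points above describe what a dataset record should contain before it is published. The sketch below checks a record against them. It is illustrative only: `DatasetRecord` and its field names are hypothetical and do not reflect PURE's actual schema or API.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DatasetRecord:
    """Hypothetical stand-in for a dataset description; not PURE's data model."""
    title: str
    funder: str = ""                                        # link to the funding source
    publications: List[str] = field(default_factory=list)   # linked where they exist
    available_online: bool = False
    doi: Optional[str] = None
    access_restricted: bool = False
    restriction_reason: Optional[str] = None
    access_conditions: Optional[str] = None


def missing_items(record: DatasetRecord) -> List[str]:
    """Return the items the slides ask for that this record does not yet have."""
    problems = []
    if not record.funder:
        problems.append("link to funding source")
    if record.available_online and not record.doi:
        problems.append("persistent identifier (DOI)")
    if record.access_restricted:
        if not record.restriction_reason:
            problems.append("reason for restricted access")
        if not record.access_conditions:
            problems.append("conditions for access")
    return problems


if __name__ == "__main__":
    draft = DatasetRecord(title="Example dataset", funder="EPSRC",
                          available_online=True, access_restricted=True)
    print(missing_items(draft))
```

Associated publications are kept as a field but not enforced, since the slide asks for a link to "any" that exist.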
Key expectation 3: include a statement in published papers underpinned by EPSRC-funded data
• This expectation could be satisfied by citing data in the published research with links to
the data or to supporting documentation that describes the data, how it may be
accessed and any constraints that may apply. Such links should be persistent URLs such
as DOIs.
• An example of a basic data citation would be of the form: ‘Creator (Publication Year):
Title. Publisher. DOI’. Further details can be found at:
https://www.datacite.org/services/cite-your-data.html
• If commercial, legal or ethical reasons exist to protect access to the data, these should
be noted in a statement included in the published research paper. A simple direction to
interested parties to ‘contact the author for access’ may not be considered sufficient.
• The paper must also be made Open Access in PURE.
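The basic citation form quoted above, ‘Creator (Publication Year): Title. Publisher. DOI’, can be produced mechanically. This is a minimal sketch: the function name is hypothetical, the DOI in the usage line is a made-up placeholder, and rendering the DOI as an https://doi.org/ link is one common way to make it a persistent URL rather than something the slides prescribe.

```python
def format_data_citation(creator: str, year: int, title: str,
                         publisher: str, doi: str) -> str:
    """Hypothetical helper: build 'Creator (Publication Year): Title. Publisher. DOI'."""
    doi = doi.strip()
    # Accept a bare DOI or one already written with a common prefix.
    for prefix in ("https://doi.org/", "http://dx.doi.org/", "doi:"):
        if doi.lower().startswith(prefix):
            doi = doi[len(prefix):]
            break
    return f"{creator} ({year}): {title}. {publisher}. https://doi.org/{doi}"


if __name__ == "__main__":
    # Placeholder values for illustration only; not a real dataset.
    print(format_data_citation("Researcher, A.", 2016, "Example dataset",
                               "University of Edinburgh", "doi:10.1234/example"))
```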
Support
Implementation of the EPSRC Policy at Edinburgh is being supported by the Research
Data Service delivered by Information Services.
For help with meeting this policy requirement, contact:
• Email: IS.Helpline@ed.ac.uk with “Help with EPSRC data policy framework” in
your subject line.
• Email: PURE@ed.ac.uk if you have questions about PURE.
For help with research data management in general, contact:
• Email: IS.Helpline@ed.ac.uk with “Help with Research Data Management in
general” in your subject line.
• Email: IS.Helpline@ed.ac.uk if you would like to arrange an RDM training or
awareness raising session.
Thanks!
• Data Library Services: http://www.ed.ac.uk/is/data-library
• EDINA: http://edina.ac.uk/
• University of Edinburgh RDM Roadmap: http://www.ed.ac.uk/information-services/about/strategy-planning/rdm-roadmap
• University of Edinburgh RDM Policy: http://www.ed.ac.uk/information-services/about/policies-and-regulations/research-data-policy
• DataStore: https://www.wiki.ed.ac.uk/x/Np9FD
• Edinburgh DataShare: http://datashare.is.ed.ac.uk/
• Data Catalogue in PURE: http://www.pure.ed.ac.uk
• Writing and using a software management plan: http://www.software.ac.uk/resources/guides/software-management-plans


Editor's Notes

  1. First of its kind in the UK – primarily within the social sciences but not exclusively so.
     2 data library services – this morning I’ll concentrate on Edinburgh DataShare.
     Advise on storing, versioning, documenting, formatting and anonymising researchers’ data for sharing or preserving for future use in an archive or repository.