Stuart Macdonald
Associate Research Data Librarian
EDINA & Data Library
stuart.macdonald@ed.ac.uk
Introduction to RDM Workshop
School of Geosciences
2 Nov. 2015
Introduction to RDM for
Geoscience PhD students
Defining research data
 Research data are collected, observed or created, for the
purposes of analysis to produce and validate original research
results.
 Data can also be created by researchers for one purpose and
used by another set of researchers at a later date for a
completely different research agenda.
 Digital data can be:
 created in a digital form ('born digital')
 converted to a digital form (digitised)
Types of research data
 Instrument measurements*
 Experimental observations*
 Still images, video and audio*
 Text documents, spreadsheets,
databases*
 Quantitative data (e.g. household
survey data)*
 Survey results & interview transcripts*
 Simulation data, models & software
 Slides, artefacts, specimens, samples*
 Sketches, diaries, lab notebooks …*
* Potentially geo-referenced
Research Data Management (RDM)
 Research data management is caring for, facilitating
access to, preserving and adding value to research
data throughout their lifecycle.
 Data management is one of the essential areas of
responsible conduct of research.
 Good research needs good data !!
Why we give you this course in your 1st year
Good research data management practices = more efficient, less stressful research / PhD
Activities involved in RDM
 Data Management Planning
 Creating data
 Documenting data
 Accessing / using data
 Selection and appraisal
 Storage and backup
 Sharing data
 Preserving data
 Re-use
Why manage your data?
 To meet funder / university / industry requirements.
 So you can find and understand it when needed.
 To avoid unnecessary duplication.
 To increase efficiency
 To validate results if required.
 So your research is visible and has impact.
 To get credit when others cite your work.
Drivers of RDM
“Publicly funded research data are a public good, produced in the
public interest, which should be made openly available with
as few restrictions as possible in a timely and responsible
manner that does not harm intellectual property.”
RCUK Common Principles on Data Policy
http://www.rcuk.ac.uk/research/datapolicy/
Funder requirements
 Funders are increasingly requiring researchers to meet certain
data management criteria.
 When applying for funding, you need to submit a technical or
data management plan.
 You are expected to make your data publicly available where
appropriate at the end of your project.
 Tip: familiarise yourselves with your funders’ demands with
respect to data management!
http://www.dcc.ac.uk/resources/policy-and-legal/overview-funders-data-policies
What do Funders want?
University’s RDM Policy
 The University of Edinburgh was one of
the first universities in the UK to
adopt a policy for managing
research data:
http://www.ed.ac.uk/is/research-data-policy
 The policy was approved by the
University Court on 16 May 2011.
 It’s acknowledged that this is an
aspirational policy and that
implementation will take some
years.
RDM Services at the University of Edinburgh
 RDM website: www.ed.ac.uk/is/data-management
 RDM blog: http://datablog.is.ed.ac.uk/
 MANTRA: http://datalib.edina.ac.uk/mantra/
 RDM policy: www.ed.ac.uk/is/research-data-policy
 DMPonline: https://dmponline.dcc.ac.uk/
 DataStore
 DataSync: www.ed.ac.uk/is/datasync
 DataShare: http://datashare.is.ed.ac.uk/
 Data catalogue in PURE: http://edin.ac/1OF8Auq
 Data vault: ready by mid-2016
What is a Data Management Plan?
DMPs are written at the start of a project to define:
 What data will be collected or created?
 How data will be documented and described?
 Where data will be stored?
 Who will be responsible for data security and backup?
 Which data will be shared and/or preserved?
 How data will be shared and with whom?
DMPs are often submitted as part of grant applications, but are useful in
their own right whenever you are creating data.
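One way to keep track of these questions is to hold the plan as a small structured record and list which parts are still unanswered. The sketch below is illustrative only: the `DataManagementPlan` class and its field names are hypothetical, chosen to mirror the six questions above, and are not part of DMPonline or any funder template. The example answers are made up.

```python
from dataclasses import dataclass, fields


@dataclass
class DataManagementPlan:
    """Hypothetical record with one field per DMP question (not a funder template)."""
    data_collected: str = ""       # What data will be collected or created?
    documentation: str = ""        # How will data be documented and described?
    storage_location: str = ""     # Where will data be stored?
    security_owner: str = ""       # Who is responsible for data security and backup?
    shared_or_preserved: str = ""  # Which data will be shared and/or preserved?
    sharing_method: str = ""       # How will data be shared, and with whom?

    def unanswered(self) -> list:
        """Return the names of the fields that are still empty."""
        return [f.name for f in fields(self) if not getattr(self, f.name).strip()]


# Made-up example: only two of the six questions have been answered so far.
plan = DataManagementPlan(
    data_collected="Seismometer readings and field photographs",
    storage_location="University DataStore",
)
print(plan.unanswered())
```

Printing the unanswered fields gives a simple to-do list that can be revisited as the project develops.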
DMPonline
Free and open web-based tool to help
researchers write plans:
https://dmponline.dcc.ac.uk/
It features:
 Templates based on different
funder requirements
 Tailored guidance (disciplinary,
school etc.)
 Customised exports to a variety of
formats
 Ability to share DMPs with others
DMPonline screencast:
http://www.screenr.com/PJHN
DataShare
• Edinburgh DataShare is the University’s open access (OA), multi-disciplinary data repository, hosted by the
Data Library: http://datashare.is.ed.ac.uk
• Assists researchers who want to:
• share their data,
• get credit for data publication,
• preserve their data for the long term (DOI, licence, citation).
• It can help researchers comply with funder & institutional requirements.
DataStore
 Facility to store data that are actively used in current research activities
 Provision: 2.3 PB total storage; 0.5 TB (500 GB) per researcher, PGR upwards
 Up to 0.25 TB of each allocation can be used to create “shared” storage
 Cost of extra storage: £200 per TB per year, which covers 1 TB primary storage, 10 days online file history,
60 days backup and a disaster recovery (DR) copy
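The allocation and price quoted above allow a rough estimate of what additional active storage might cost over a project. The sketch below is a back-of-envelope calculation only: the function name is hypothetical, and it assumes extra storage is bought in whole terabytes above the free 0.5 TB allocation, which the slide does not state.

```python
import math

FREE_ALLOCATION_TB = 0.5     # per-researcher allocation quoted on the slide
PRICE_PER_TB_PER_YEAR = 200  # GBP, quoted on the slide


def extra_storage_cost(total_needed_tb: float, years: int) -> int:
    """Estimate the cost in GBP of storage beyond the free allocation.

    Assumption (not from the slide): extra storage is bought in whole TB.
    """
    extra_tb = max(0.0, total_needed_tb - FREE_ALLOCATION_TB)
    billable_tb = math.ceil(extra_tb)
    return billable_tb * PRICE_PER_TB_PER_YEAR * years


# Example: 2 TB of active data over a 3-year PhD.
print(extra_storage_cost(2.0, 3))
```

If billing turns out to be pro rata rather than per whole terabyte, replacing `math.ceil(extra_tb)` with `extra_tb` gives the corresponding estimate.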
DataSync
 'Dropbox-like' file-hosting service for non-sensitive data:
www.ed.ac.uk/is/datasync
 Allows sharing and synchronisation of data.
 Share using local clients or web URL with colleagues
anywhere.
 20GB free storage or map to personal / group data on
DataStore as required.
 Using the ownCloud open source application.
DataVault
 Safe, private store of data that is only accessible by the data creator or their
representative
 Secure storage: file security; storage security; additional security: encryption
 Long-term assurance; automatic versioning
 Being developed as a joint project with the University of Manchester and partly funded by Jisc.
Full version expected to be in place in 2016
PURE: Describing your data
 You can describe your datasets (creating metadata) in
PURE (datasets field): http://edin.ac/1OF8Auq
 Doing this will help your datasets to be discovered,
accessed, and reused as appropriate - ready to use.
External repositories
When choosing an external repository, you should consider:
 Does your funder require data to be offered to a specific repository?
 Is the repository sustainable?
 How much will it cost? Are costs upfront or annual?
 How does the repository promote discoverability?
 Does the repository record when data are accessed, downloaded, or cited, so that you
will get recognition for your work?
 What will be done with your data if the repository closes down?
NERC Data Centres
The NERC Data Policy sets the RDM ground rules that NERC-funded researchers must follow.
Central to the policy is that NERC-funded scientists must make their data openly available
within two years of collection and deposit them in a NERC data centre for long-term
preservation.
The Data Policy details a commitment to support long-term access to environmental data
and also outlines the roles and responsibilities of those involved in collecting and
managing environmental data.
GoGeo: http://www.gogeo.ac.uk/
The online Geodoc
Metadata Editor tool
allows users to create,
validate, edit, export
and import geospatial
metadata records.
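To give a sense of what a geospatial metadata record contains and what validating one involves, here is a minimal sketch. It does not use or reproduce the Geodoc editor: the field names and the `validate_record` function are hypothetical, the coordinates are assumed to be decimal degrees, and the record's values are made up for illustration.

```python
# Hypothetical minimum set of fields; real metadata standards define many more.
REQUIRED_FIELDS = ("title", "abstract", "creator", "date", "bounding_box")


def validate_record(record: dict) -> list:
    """Return a list of problems found in a (hypothetical) metadata record."""
    problems = [
        f"missing field: {name}" for name in REQUIRED_FIELDS if not record.get(name)
    ]
    box = record.get("bounding_box")
    if box:
        west, south, east, north = box  # decimal degrees assumed
        if not (-180 <= west <= 180 and -180 <= east <= 180):
            problems.append("longitude out of range")
        if not (-90 <= south <= north <= 90):
            problems.append("latitude out of range or south > north")
    return problems


# Made-up example record.
record = {
    "title": "Example borehole temperature survey",
    "abstract": "Illustrative record only.",
    "creator": "A. Researcher",
    "date": "2015-11-02",
    "bounding_box": (-3.4, 55.8, -3.0, 56.0),  # west, south, east, north
}
print(validate_record(record))
```

An empty list means no problems were found. A record with missing fields or an inverted bounding box returns one message per problem.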
RDM Support
• Introductory sessions on RDM: contact IS.Helpline@ed.ac.uk to arrange a session for your
School or subject group.
• RDM website: http://www.ed.ac.uk/is/data-management
• RDM blog: http://datablog.is.ed.ac.uk
• RDM wiki: https://www.wiki.ed.ac.uk/display/RDM/Research+Data+Management+Wiki
• Training sessions and workshops:
http://www.ed.ac.uk/schools-departments/information-services/research-support/data-management/rdm-training
Training: MANTRA online course
 MANTRA is an internationally
recognized, self-paced online training
course in data management, developed
here for PGRs and early career
researchers.
 Anyone doing a research project will
benefit from at least some part of the
training (and you can pick and choose)
 Data handling exercises with open
datasets in 4 analytical packages: R,
SPSS, NVivo, ArcGIS
http://datalib.edina.ac.uk/mantra
Training: Tailored courses
 A range of training on research data
management (RDM), in the form of
workshops, power sessions, seminars
and drop-in sessions, to help
researchers with RDM issues:
http://www.ed.ac.uk/schools-departments/information-services/research-support/data-management/rdm-training
 Creating a data management plan
for your grant application
 Good practice in Research Data
Management
 Handling data using SPSS
 Visualising data using ArcGIS / QGIS
 Registration via MyED:
http://edin.ac/1kRMPv3
Data Library
• Data support & consultancy service
• Help with obtaining and using data (incl. spatial data
products)
• Survey and Documentation Analysis (SDA)
• Advice on Research Data Management
• All queries regarding DataShare
Email: datalib@ed.ac.uk
Website: http://www.ed.ac.uk/is/data-library
Questions?
What is a Data Management Plan?
DMPs are written at the start of a project to define:
 What data will be collected or created?
 How data will be documented and described?
 Where data will be stored?
 Who will be responsible for data security and backup?
 Which data will be shared and/or preserved?
 How data will be shared and with whom?
Getting Started with a DMP
 Gain an understanding of terminology & issues (MANTRA).
 Gain an understanding of your project/community by talking to:
• your supervisor and colleagues
• people in your School, e.g. IT Officers, Graduate Research Coordinator...
 Talk to your supervisor about data authorship, IP, licensing, legislation,
disclosure, ethics, policies
 Use a research data planning checklist
• http://www.dcc.ac.uk/resources/data-management-plans/checklist
 Remember it is never finished! Review it regularly through the course of
your research
Top tips
 Keep it simple, short and specific.
 Don't spend too much time on it: leave gaps for what you don't know,
investigate, and fill them in later.
 Avoid jargon
 Seek advice - consult and collaborate
 Base plans on available skills and support
 Make sure implementation is feasible
 Justify any resources or restrictions needed
Also see: http://www.youtube.com/watch?v=7OJtiA53-Fk
Writing a Data Management Plan

Editor's Notes

  1. National Academy of Science