
Fedora Oxford Dec09

  • 1. Edinburgh DataShare: A DSpace Data Repository: Achievements and Aspirations
      Stuart Macdonald, EDINA National Data Centre & Edinburgh University Data Library
      Fedora-UK&I&EU Meeting, Oxford, 8 December
      Image: Flickr CC, http://www.flickr.com/photos/laszlo-photo/1899390628/
  • 2. EDINA and Data Library (EDL) together are a division within Information Services of the University of Edinburgh.
      - EDINA is a JISC-funded National Data Centre providing national online resources for education and research.
      - The Data Library service (established in 1983) assists Edinburgh University users in the discovery, access, use and management of research data assets.
      - It builds relationships with researchers via postgraduate teaching activities, IS Skills workshops, Research Data Management training and traditional reference interviews.
      - Edinburgh DataShare is a digital repository of multi-disciplinary research datasets produced at the University of Edinburgh, hosted by the Data Library.
      - Image: Flickr CC, http://www.flickr.com/photos/artimagesmarkcummins/300173269/
  • 3. DISC-UK DataShare Project
      - Funded by JISC (March 2007 to March 2009): a collaborative project which investigated the legal, cultural and technical issues surrounding research data sharing within the UK tertiary education community.
      - Explore new pathways to assist academics wishing to share their data over the Internet via Institutional Repositories (IRs).
      - Green, A., Macdonald, S. and Rice, R. (2009). Policy-Making for Research Data in Repositories: A Guide.
      - Edinburgh DataShare digital repository: a post-project output, embedded into University Information Services policy.
      - Image: Flickr CC, http://www.flickr.com/photos/ronin691/2285257955/
  • 4. EDINA repository projects: Edinburgh Repository Landscape
      [Diagram; surviving labels: 'Named entity' recognition; Launch Jan. 2010; Repository Junction]
  • 5. Edinburgh DataShare: technical development
      - DSpace v.1.5.1 with a new theme aligned with the University corporate style
      - Embargo option: coded to restrict full data download, with open metadata, until a specified date
      - Open Data Commons License option (PDDL)
      - Dynamically queries Geonames, a community-generated spatial database, to ensure consistency in metadata entry for the Spatial Coverage field
      - Extension to DSpace to record bitstream downloads in usage statistics
      - Implementation of JACS for assigning keywords to content
      - Download All option (zip file of all item components)
      - Citation field automatically generated based on specified metadata values
      - Dublin Core-based metadata schema for datasets
      - Image: Flickr CC, http://www.flickr.com/photos/59414209@N00/3367225630/
  • 6. Potential Next Steps
      - Semantification of content: Open Linked Data & MIT's SIMILE project (which aims to leverage and extend DSpace by supporting semantic web techniques)
      - Visualisation tools: APIs that can be utilised within the repository environment (Timeplot, Timeline), unlike existing open utilities
      - Spatial analysis using open geo-browsers and mapping utilities
      - Annotation, tagging, data citation
      - Implementation of deposit tool(s): SWORD
      - Streaming & viewing heterogeneous content
      - Image: Flickr CC, http://www.flickr.com/photos/silvertje/3512611046/
  • 7. Research Data Management Training
      - Developed on the back of the Data Audit Framework led by HATII (Edinburgh being one of the pilot implementation projects): developing online tools and methodologies which enable information specialists to engage and build relationships with researchers in order to ascertain the data holdings in a research school / department / group.
      - Developing online and face-to-face (F2F) modules in conjunction with the Postgraduate Transferable Skills Unit (http://www.transkills.ed.ac.uk/) and PG Essentials (http://www.transkills.ed.ac.uk/pgessentials.htm).
      - Research data management guidance & Data sharing and preservation: http://www.ed.ac.uk/is/data-management
      - Engaging researchers through RDM exercises, teaching, reference interviews & funded projects is crucial to the bottom-up development of tools, services and infrastructures meant to serve them, and which ultimately improve both the efficiency and effectiveness of their research.
      - Image: Flickr CC, http://www.flickr.com/photos/sgrantarch/3563676104/
  • 8. Engaging with the Research Community (1): Human Geography (Professor Jane Jacobs)
      - High Rise project: an interdisciplinary research programme involving architects and geographers. "It investigates two cases that encapsulate the varied fortunes of the highrise experience: the UK, where the form is routinely condemned, even demolished; and Singapore, where it is embraced enthusiastically and continues to be built at greater heights and densities" (http://www.ace.ed.ac.uk/highrise/).
      - Content of the High Rise Digital Archive includes images, sound recordings, video, transcripts and architectural drawings.
      - Customise DataShare to allow streaming or embedded player functionality, ingest heterogeneous content, and employ multi-media metadata standards.
      - Develop customised learning and teaching materials for deposit in JORUM Open.
      - Image courtesy of the periodic table printmaking project, http://azuregrackle.com/periodictable/table/58.html
  • 9. Engaging with the Research Community (2): Centre for Earth System Dynamics (Dr Mike Mineter, Dr Magnus Hagdorn)
      - The aim of the CESD is to develop climate models across SAGES and other multi-disciplinary and international partners to quantify and predict climate and environmental change.
      - Use DataShare to provide federated access via Shibboleth for international partners for 'working' datasets (e.g. from the Arctic Biosphere Atmosphere Coupling at Multiple Scales (ABACUS) programme).
      - Store data via DataShare but also link to large datasets (c. 1 TB) held on remote storage (Andrew File System, ECDF SAN).
      - Content for ingestion includes large climate models/simulations, fieldwork and experimental output in proprietary formats.
      - Employ discipline-specific metadata standards to describe content.
  • 10. END. Thank You
      - stuart.macdonald@ed.ac.uk
      - Creative Commons images from Flickr (unless otherwise stated)
      - Image: Flickr CC, http://www.flickr.com/photos/hippie/2556161507/
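
Two of the features listed on slide 5, the embargo option and the automatically generated citation field, can be illustrated with a short sketch. This is not DataShare's code (the slides show none, and DSpace itself is a Java application); it is a minimal Python illustration under stated assumptions. The function names, the record layout and the citation pattern are all hypothetical, and the example record is invented.

```python
from datetime import date
from typing import Optional


def download_allowed(embargo_until: Optional[date], today: date) -> bool:
    """Return True when the data files may be downloaded.

    Assumption: metadata stays visible regardless; only the full data
    download is held back until the embargo date is reached.
    """
    return embargo_until is None or today >= embargo_until


def build_citation(record: dict) -> str:
    """Assemble a citation string from Dublin Core-style metadata values.

    The field names and the citation pattern are hypothetical.
    """
    # Drop a trailing full stop so initials do not produce "A.." in the output.
    creators = "; ".join(record["creator"]).rstrip(".")
    year = record["date"][:4]  # assumes an ISO-style date such as "2009-12-08"
    return (
        f"{creators}. ({year}). {record['title']} [Dataset]. "
        f"{record['publisher']}. {record['identifier']}"
    )


# Hypothetical record, invented purely for illustration.
example = {
    "creator": ["Smith, J.", "Jones, A."],
    "date": "2009-12-08",
    "title": "Example survey data",
    "publisher": "University of Edinburgh",
    "identifier": "http://example.org/handle/0000/1",
}

print(build_citation(example))
print(download_allowed(date(2010, 6, 1), today=date(2009, 12, 8)))
```

Keeping the embargo test separate from the metadata record mirrors the behaviour described on the slide: the description remains open while the files themselves are withheld.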

Editor's Notes

  1. Hi, can you hear me OK? I’m going to talk to you this morning about the evolution of the Edinburgh DataShare data repository.
  2. First of its kind in the UK, primarily within the social sciences but not exclusively so. Two Data Library services; this morning I’ll concentrate on Edinburgh DataShare. Advise on storing, versioning, documenting, formatting and anonymising researchers’ data for sharing or preserving for future use in an archive or repository.
  3. Depositing data, access and re-use, data quality requirements, metadata, confidentiality and disclosure, formatting, versioning, preservation, normalisation, back-up, storage, security
  4. IT Committee & Library Committee currently reviewing Research Data Storage and Research Data Management: review current practice, develop options including costs, feasibility and risk analysis of actions/inactions in this field (Prof. Jeff Haywood, Professor of Education and Technology / Vice Principal for Knowledge Management).
  5. By-pass 1.5.2 bug fix; upgrade to 1.6 next year. Registration streamlined using the University’s single sign-on. Date range enabled to allow Time Period (dc:coverage). Anti-virus checking upon upload. Joint Academic Classification of Subjects: HESA / UCAS classify academic subjects based on undergraduate degrees.
  6. RDF tools – both content and metadata
  7. Allows institutions to identify, locate, describe and assess how they are managing their research data. RDM as a means to engage with researchers (PhD students, early career scientists) with a view to embedding good RDM practice. Also introducing researchers to the concept of data repositories as a home for research data output.
  8. School of GeoSciences. Digital Equipment and Database Enhancement for Impact (AHRC call). Target audience: academic researchers and teachers, also repository developers, public archives, secondary teaching professionals, policy professionals and urban designers, in areas of housing, heritage and sustainable cities.
  9. Initial stages of discussion: the impending JISC RDM Infrastructure call for developer time. Scottish Alliance for Geoscience, Environment and Society: multi-university consortium. C.f. collaboration and publication domains as developed at Monash.
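
Slide 5 and note 5 mention two coverage fields: Spatial Coverage, checked against Geonames, and Time Period, entered as a date range under dc:coverage. The sketch below is only an illustration under assumptions. It builds a request URL for the public GeoNames search service as commonly documented (`searchJSON` with `q`, `maxRows` and `username` parameters); the slides do not say which endpoint DataShare calls. The `start/end` string format for the time period is likewise an assumption, and both function names are hypothetical.

```python
from datetime import date
from urllib.parse import urlencode

# Public GeoNames search endpoint (assumed; not confirmed by the slides).
GEONAMES_SEARCH = "http://api.geonames.org/searchJSON"


def geonames_search_url(place: str, username: str, max_rows: int = 5) -> str:
    """Build a GeoNames place-name search URL for a Spatial Coverage lookup.

    No request is sent here; a deposit form could fetch this URL and offer
    the returned names so that depositors pick a consistent spelling.
    """
    params = {"q": place, "maxRows": max_rows, "username": username}
    return f"{GEONAMES_SEARCH}?{urlencode(params)}"


def time_period(start: date, end: date) -> str:
    """Format a Time Period value as 'start/end' (hypothetical format)."""
    if end < start:
        raise ValueError("end date precedes start date")
    return f"{start.isoformat()}/{end.isoformat()}"


print(geonames_search_url("Arctic Ocean", username="demo_user"))
print(time_period(date(2005, 1, 1), date(2009, 12, 31)))
```

Validating the date order at entry time serves the same purpose as the place-name lookup: values are checked when they are deposited rather than cleaned up afterwards.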