Data Citation: 
A Critical Role for Publishers 
Brian Hole, Founder and CEO 
SciDataCon 2014, Citing Data to Facilitate Multidisciplinary Research session 
New Delhi, November 5 2014 
brian.hole@ubiquitypress.com www.ubiquitypress.com / @ubiquitypress
Overview 
• Why is data citation important?
• Publisher guidelines
• Copyediting with data in mind
• Data papers
• Machine readability
The Social Contract 
of Science 
• Dissemination 
• Validation 
• Further development 
Scientific Malpractice 
• Results 
• Data 
• Software 
• Hardware, wetware… 
Source: http://www.smbc-comics.com/index.php?db=comics&id=2015 
2. Publisher Guidelines 
• No single way to cite data, but good guidelines are
available (e.g. Force11)
• Journal must have clear guidelines about how to 
cite data, e.g.: 
• Creators, date of publication, host repository, 
version, persistent identifier 
• Must be included in reference list 
Alexander NS, Wint W (2013) Data from: Projected population 
proximity indices (30km) for 2005, 2030 & 2050. Dryad 
Digital Repository. http://dx.doi.org/10.5061/dryad.12734 
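The elements listed on this slide can be assembled mechanically. A minimal sketch, assuming a hypothetical `DataCitation` record and `format_citation` helper (neither comes from the talk); the output style simply mimics the Dryad example above:

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DataCitation:
    """Hypothetical record holding the elements a journal guideline might require."""
    creators: List[str]
    year: int
    title: str
    repository: str
    identifier: str                 # persistent identifier, e.g. a DOI URL
    version: Optional[str] = None   # optional, only printed when present


def format_citation(c: DataCitation) -> str:
    """Render the record in the style of the Dryad example on this slide."""
    parts = [f"{', '.join(c.creators)} ({c.year}) {c.title}."]
    if c.version:
        parts.append(f"Version {c.version}.")
    parts.append(f"{c.repository}.")
    parts.append(c.identifier)
    return " ".join(parts)


example = DataCitation(
    creators=["Alexander NS", "Wint W"],
    year=2013,
    title="Data from: Projected population proximity indices (30km) for 2005, 2030 & 2050",
    repository="Dryad Digital Repository",
    identifier="http://dx.doi.org/10.5061/dryad.12734",
)
print(format_citation(example))
```

Keeping the elements as separate fields, rather than as one free-text string, is what later allows the same record to be exported as XML or RDF.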
3. Copyediting with Data in Mind 
• Publishers need to provide better guidelines for 
copyeditors: 
• Make sure journal guidelines for data 
citation are being followed 
• Go back to authors if no citation included 
• Fix incorrect citations (e.g. simple hyperlinks 
in text) 
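Part of this check could be automated. A rough sketch, assuming the manuscript body and the reference list are available as plain text; the function names and the loose DOI pattern are illustrative assumptions, not a tool described in the talk:

```python
import re

# Loose DOI pattern; real DOIs allow more characters, so treat matches as hints only.
DOI_PATTERN = re.compile(r"10\.\d{4,9}/[^\s\"<>]+")


def find_dois(text: str) -> set:
    """Return DOIs found in text, lower-cased, with trailing punctuation stripped."""
    return {m.rstrip(".,;)").lower() for m in DOI_PATTERN.findall(text)}


def uncited_dois(body_text: str, reference_list: str) -> set:
    """DOIs mentioned in the body that never appear in the reference list."""
    return find_dois(body_text) - find_dois(reference_list)


body = "The data are available at http://dx.doi.org/10.5061/dryad.12734."
references = 'Oldenburg H (1665). "Epistle Dedicatory". doi:10.1098/rstl.1665.0001.'
print(uncited_dois(body, references))
```

Any DOI this returns is a candidate for the "simple hyperlink in text" problem: the copyeditor would still need to confirm that it points to a dataset and ask the author for a full citation.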
4. Data Papers
• Data papers incentivize authors to follow good 
practice in releasing and citing data: 
• Data professional is lead author 
• Paper advertises the work, encourages reuse and
collaboration, and indicates impact
• Makes citation much easier: 
• Data is automatically cited correctly in 
data paper 
• Data paper is naturally included in 
reference list of research papers 
• Citations etc. can be tracked 
5. Machine Readable Citations 
• Many data reuse scenarios involve locating, 
querying and recombining data from a large 
number of sources 
• This can be made significantly easier by 
making data citations machine readable 
• Enables locating of data via text mining of 
relevant literature 
• Two possible methods – XML and RDF 
XML
• The Journal Article Tag Suite (JATS), maintained by
NISO, is used by most publishers
• JATS currently recommends tagging data references
as web publications
<ref>
  <element-citation publication-type="database" publication-format="web">
    <source>Database of Human Disease Causing Gene Homologues in Dictyostelium Discoideum [Internet]</source>
    <publisher-loc>San Diego (CA)</publisher-loc>
    <publisher-name>San Diego Supercomputer Center</publisher-name>
    <year>2003</year>
    <date-in-citation>cited 2007 Feb 2</date-in-citation>
    <comment>Available from: <uri>http://dictyworkbench.sdsc.edu/HDGDD/</uri>.</comment>
  </element-citation>
</ref>
• Not ideal, but 
available now
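Even this interim tagging is enough for a program to pick data references out of a reference list. A minimal sketch using Python's standard `xml.etree.ElementTree`; the `extract_data_citations` helper and the choice of fields it returns are assumptions made for illustration:

```python
import xml.etree.ElementTree as ET

# The example reference from this slide, with the wrapped URI rejoined.
JATS_REF = """
<ref>
  <element-citation publication-type="database" publication-format="web">
    <source>Database of Human Disease Causing Gene Homologues in Dictyostelium Discoideum [Internet]</source>
    <publisher-loc>San Diego (CA)</publisher-loc>
    <publisher-name>San Diego Supercomputer Center</publisher-name>
    <year>2003</year>
    <date-in-citation>cited 2007 Feb 2</date-in-citation>
    <comment>Available from: <uri>http://dictyworkbench.sdsc.edu/HDGDD/</uri>.</comment>
  </element-citation>
</ref>
"""


def extract_data_citations(xml_text: str) -> list:
    """Return title, year and URI for every citation tagged as a database."""
    root = ET.fromstring(xml_text)
    results = []
    for cit in root.iter("element-citation"):
        if cit.get("publication-type") != "database":
            continue  # skip ordinary journal, book, etc. references
        results.append({
            "title": (cit.findtext("source") or "").strip(),
            "year": cit.findtext("year"),
            "uri": cit.findtext(".//uri"),  # nested inside <comment>
        })
    return results


for item in extract_data_citations(JATS_REF):
    print(item["year"], item["uri"])
```

The weakness the slide points to is visible here: the only signal that this is data is the `publication-type` attribute, and the location has to be dug out of a free-text comment.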
• Several proposals for improvements with more 
suitable terms: 
• NISO-JATS Data Citation Implementation Workshop 
held at the British Library in June 2014 
• Force11 Data Citation Implementation Group 
• <JATS4R> publisher group 
• E.g.: 
<name> -> <collab collab-type="curators"> 
<source> -> <data-title> 
<edition> -> <version> 
<license>
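To illustrate the two simple renames above, here is a hedged sketch that rewrites an existing citation element in place. The sample citation is hypothetical, the tag names were only proposals at the time of the talk, and the `<name>` to `<collab>` change and the new `<license>` element are left out because they need more than a rename:

```python
import xml.etree.ElementTree as ET

# Renames taken from the slide; they were proposals, not yet part of JATS.
PROPOSED_RENAMES = {"source": "data-title", "edition": "version"}


def apply_proposed_tags(citation: ET.Element) -> ET.Element:
    """Rename elements in place according to PROPOSED_RENAMES."""
    for el in citation.iter():
        if el.tag in PROPOSED_RENAMES:
            el.tag = PROPOSED_RENAMES[el.tag]
    return citation


# Hypothetical citation, invented only to exercise the function.
old = ET.fromstring(
    '<element-citation publication-type="database">'
    "<source>Example dataset</source><edition>2</edition>"
    "</element-citation>"
)
new = apply_proposed_tags(old)
print(ET.tostring(new, encoding="unicode"))
```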
RDF
• Makes not only the data discoverable through citation,
but also its relationship to the research
• JATS to RDF provides a start for this, but publishers
still need to make data citations more specific
4. Oldenburg H (1665). "Epistle Dedicatory". Philosophical Transactions of the Royal Society
of London 1: 0–0. doi:10.1098/rstl.1665.0001.
<rdf:Description rdf:about="reference-item-4"><co:index>4</co:index></rdf:Description>
<rdf:Description rdf:about="reference-4"><dcterms:bibliographicCitation>Oldenburg, H (1665). </dcterms:bibliographicCitation></rdf:Description>
<rdf:Description rdf:about="reference-4"><rdf:type rdf:resource="http://purl.org/spar/biro/BibliographicReference"/></rdf:Description>
<rdf:Description rdf:about="reference-4"><dcterms:identifier>b4</dcterms:identifier></rdf:Description>
<rdf:Description rdf:about="reference-4"><biro:references rdf:resource="reference-4-textual-entity"/></rdf:Description>
<rdf:Description rdf:about="textual-entity"><cito:cites rdf:resource="reference-4-textual-entity"/></rdf:Description>
<rdf:Description rdf:about="reference-4-textual-entity"><rdf:type rdf:resource="http://purl.org/spar/fabio/Expression"/><frbr:realizationOf rdf:resource="reference-4-conceptual-work"/></rdf:Description>
<rdf:Description rdf:about="reference-4-textual-entity"><rdf:type rdf:resource="http://purl.org/spar/fabio/JournalArticle"/></rdf:Description>
<rdf:Description rdf:about="reference-4-conceptual-work"><dcterms:creator rdf:resource="reference-4-agent-1"/></rdf:Description>
<rdf:Description rdf:about="reference-4-agent-1"><rdf:type rdf:resource="http://xmlns.com/foaf/0.1/Person"/></rdf:Description>
<rdf:Description rdf:about="reference-4-agent-1"><foaf:familyName>Oldenburg</foaf:familyName></rdf:Description>
<rdf:Description rdf:about="reference-4-agent-1"><foaf:givenName>H</foaf:givenName></rdf:Description>
<rdf:Description rdf:about="reference-4-textual-entity"><fabio:hasPublicationYear>1665</fabio:hasPublicationYear></rdf:Description>
<rdf:Description rdf:about="reference-4-textual-entity"><dcterms:title>Epistle Dedicatory</dcterms:title></rdf:Description>
<rdf:Description rdf:about="reference-4-textual-entity-source"><dcterms:title>Philosophical Transactions of the Royal Society of London</dcterms:title></rdf:Description>
<rdf:Description rdf:about="reference-4-textual-entity"><frbr:partOf rdf:resource="reference-4-textual-entity-source"/></rdf:Description>
<rdf:Description rdf:about="reference-4-textual-entity"><frbr:partOf rdf:resource="periodical-volume-reference-4-textual-entity"/></rdf:Description>
<rdf:Description rdf:about="periodical-volume-reference-4-textual-entity"><rdf:type rdf:resource="http://purl.org/spar/fabio/PeriodicalVolume"/><prism:volume>1</prism:volume><frbr:partOf><rdf:Description><rdf:type rdf:resource="http://purl.org/spar/fabio/Periodical"/></rdf:Description></frbr:partOf></rdf:Description>
<rdf:Description rdf:about="reference-4-textual-entity"><frbr:embodiment rdf:resource="digital-embodiment-d1e2589"/></rdf:Description>
<rdf:Description rdf:about="digital-embodiment-d1e2589"><prism:startingPage rdf:resource="0"/></rdf:Description>
<rdf:Description rdf:about="reference-4-textual-entity"><prism:doi>10.1098/rstl.1665.0001</prism:doi></rdf:Description>
• The Citation Typing Ontology (CiTO) is available 
now, and makes the relationship explicit: 
<a about="http://dx.doi.org/10.5334/jophd.ab"
   rel="cito:usesDataFrom"
   href="http://dx.doi.org/10.5061/dryad.12734">http://dx.doi.org/10.5061/dryad.12734</a>
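The same statement can be written as a single RDF triple. A small sketch that emits it in N-Triples syntax: the `cito_triple` helper is hypothetical, while the namespace `http://purl.org/spar/cito/` and the property name `usesDataFrom` are those of the CiTO vocabulary:

```python
CITO = "http://purl.org/spar/cito/"


def cito_triple(citing: str, cited: str, relation: str = "usesDataFrom") -> str:
    """Return one N-Triples line linking the citing work to the cited dataset."""
    return f"<{citing}> <{CITO}{relation}> <{cited}> ."


print(cito_triple(
    "http://dx.doi.org/10.5334/jophd.ab",      # the citing paper from the slide
    "http://dx.doi.org/10.5061/dryad.12734",   # the Dryad dataset it uses
))
```

Unlike a plain `cito:cites` link, the typed relation tells a harvester that the paper reuses the dataset rather than merely mentioning it.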
Summary 
• Clear publisher guidelines
• Copyediting with data in mind 
• Using data papers 
• Ensuring machine readability of citations 
Any questions? 
Please feel free to contact 
brian.hole@ubiquitypress.com 

Editor's Notes

  1. This is for Stuart from the Royal Society