The document discusses the Names Project in the UK, which aims to create unique identifiers for UK researchers. It began in 2007 using data from a research assessment exercise. The Names Project takes a hybrid approach, using automated matching and manual disambiguation. It also allows researchers to directly input information. The project seeks to improve data quality and integrate with other national and international identifier systems like ISNI. Key challenges include gaining agreement on national researcher identifier services.
How dinosaurs broke our system: challenges in building national researcher identifier services
1. How dinosaurs broke our system
Challenges in building national researcher
identifier services
Amanda Hill
Names Project
JISC Conference, 2010
2. Hoping that…
…Simeon has explained all about the name authority problem.
I'd like to talk about some of the work that we've done as part of the Names Project recently…
…and how that fits into today's researcher identification landscape.
3. Gross generalisation about past approaches to author identifiers
Libraries: book-level data; labour intensive (disambiguation first); authors not involved; open.
Publishers: article-level data; automatically generated (disambiguation later); authors can edit; proprietary.
4. Current international activity
ISNI: library-instigated; disambiguation first; authors not involved; broad scope.
ORCID: publisher-instigated; disambiguation later; authors can submit/edit; current researchers.
5. Signs of convergence?
Knowledge Exchange meeting on Digital Author Identifiers in March 2012 encouraged alignment of ISNI and ORCID approaches.
ISNI has reserved a block of identifiers for use by ORCID.
6. Sources of information
Both ORCID and ISNI will use existing pools of information to populate their systems.
ISNI: "Leveraging high confidence data from different domains"
"ORCID will link to other name identifier systems"
7. National author ID systems
2011: JISC-funded survey and report on national author/researcher identifier systems around the world.
Report published November 2011: http://ie-repository.jisc.ac.uk/567/
8. Maturity of systems (late 2011)
System (in development since): number of identities
- Lattes (Brazil) (1999): 1,600,000
- Frida/Cristin (Norway) (2003): 31,000 researchers at 160 institutions
- VIVO (2003): 24,400 faculty with profiles; 150,000 total IDs including undisambiguated co-authors
- Digital Author Identifier (Netherlands) (2005; 1980s for National Thesaurus of Author Names): 40,000 in the NTA; 15,000 researchers with Digital Author IDs
- Names Project (UK) (2007): 46,000
- New Zealand Electronic Text Centre (2007): 2,000
- Trove People and Organisations/NLA Party Infrastructure (Australia) (2007): 900,000 people and organisations
- AuthorClaim (2008): 200
- Researcher Name Resolver (Japan) (2008): 190,000
9. Populating identifier systems
Records may be created by cataloguers, imported from other systems, or generated by the data subjects themselves. Systems compared: AuthorClaim; Digital Author Identifier (Netherlands); Frida/Cristin (Norway); Lattes (Brazil); Names Project (UK); New Zealand Electronic Text Centre; Researcher Name Resolver (Japan); Trove People and Organisations/NLA Party Infrastructure (Australia); VIVO.
10. Good sources of data for some nations
National system | Existing unique identifiers
Japan | Researcher identifiers from national researcher databases
Netherlands | Number from the National Thesaurus of Author Names is converted into a Digital Author Identifier
Norway | Human resources data: social security numbers
Other national systems assign new identifiers as new identities are established.
11. Features of mature national
identifier systems
With more mature systems:
A national organisation generally has oversight: e.g. in
Brazil, Norway, Netherlands
Integration with research funders, reporting agencies
and institutional repositories
Individual institutions also have defined roles
relating to managing information about their own
staff
13. Work to investigate unique IDs
for UK researchers
Identified in 2006 as part of the call for
proposals for the JISC-funded Repositories
and Preservation Programme
Mimas and the British Library proposed a two-year project to:
Investigate requirements for a UK name authority
service
Build a pilot system to demonstrate potential
14. The Names Project
The Chang Project
'From the Annals of the Onomastic Society'
Ian Watson (1990)
15. Names (not an acronym…)
Name Authorities Make Everything Simpler
Names: Ambiguous, Meaningful (or
Meaningless?), Essential, Symbolic
…nearly everyone has a name-related
story
17. Original plan
Use data from the British Library's Zetoc service to
create author IDs
Journal article information from 1993 onwards
Last names, initials, paper titles, subject
classifications
But…
International in scope
Lack of information on affiliations and first names to
help with making matches
Huge dataset, leading to processing issues
18. Revised plan
Used 2008 Research Assessment Exercise
data (as cleaned up by JISC Merit project)
to pre-populate the Names system
Identify unique individuals and assign
identifiers
Data quality was good and included institutional
information, giving high accuracy despite having
only initials, not full first names
Except for…
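The matching step described above (surname, initials and institution from the RAE data) can be sketched as a simple blocking scheme. This is an illustrative sketch, not the real Names pipeline, and the record field names are hypothetical:

```python
from itertools import count

def match_key(record):
    """Crude blocking key built from surname, initials and
    institution. Field names are illustrative, not the real
    Names schema."""
    initials = "".join(part[0] for part in record["forenames"].split())
    return (record["surname"].lower(), initials.lower(),
            record["institution"].lower())

def assign_ids(records):
    """Give each distinct key a new identifier; records sharing a
    key are assumed to be one person -- exactly the assumption
    that manual quality checks still have to test."""
    ids, next_id, out = {}, count(1), []
    for rec in records:
        key = match_key(rec)
        if key not in ids:
            ids[key] = f"names:{next(next_id):06d}"
        out.append({**rec, "id": ids[key]})
    return out

people = [
    {"surname": "Smith", "forenames": "A. B.", "institution": "Manchester"},
    {"surname": "Smith", "forenames": "Alice B.", "institution": "Manchester"},
    {"surname": "Smith", "forenames": "A.", "institution": "Leeds"},
]
matched = assign_ids(people)  # first two records collapse to one identity
```

The sketch also shows the failure mode hinted at by the "Except for…": two different people sharing surname, initials and institution would wrongly collapse into one record, which is why human checks remain necessary.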
21. Building on Merit…
Merit data covers around 20% of active UK
researchers
Working to enhance records and create
new ones with information from other
sources
Institutional repositories
British Library data sets (Zetoc)
Direct input from researchers
24. Quality matters
Automatic matching can only achieve so
much
Dependent on data source
British Library team perform manual check of
results of matching new data sources
Allows for separation/merging of records
Plan to allow people to update their own
information
25. Ultimate aim
High-quality set of unique identifiers for UK
researchers and research institutions
Available to other systems (national and
international)
e.g. Names records exported to ISNI in 2011
Possible additional services
Disambiguation of existing data sets
Identification of external researchers
26. Access to Names
API allows for flexible searching of Names
data
EPrints plugin released in 2011: allows
repository users to choose from a list of
Names identities
…and to create a Names record if none exists
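A search against such an API might be composed as below. The base URL and parameter names are placeholders for illustration, not the actual Names API:

```python
from urllib.parse import urlencode

# Placeholder base URL, not the real Names endpoint
BASE = "http://example.org/names/api"

def search_url(family_name, institution=None, fmt="json"):
    """Compose a search query string; parameter names are illustrative."""
    params = {"q": family_name, "format": fmt}
    if institution:
        params["institution"] = institution
    return f"{BASE}/search?{urlencode(params)}"

url = search_url("Smith", institution="Manchester")
```

A repository plugin like the EPrints one mentioned above would issue queries of this shape and present the returned identities for the depositor to pick from.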
29. Next steps…
JISC-convened Researcher ID group: final
meeting in September, leading to recommendations
Options Appraisal Report for a UK national
researcher identifier service, due in December
Improving data and adding new records
30. Summing up
Names is a hybrid of library/publisher
approaches
Automated matching/disambiguation
Human quality checks
Data immediately available for re-use in other
systems
Researchers can supply information
31. An evolving area
Main challenges are cultural and political
rather than technical
National author/researcher ID services can be
important parts of research infrastructure
Getting agreement and co-ordination at
national level is vital
Speaker notes:
…and, I would say, they are all very jealous of those countries with ready-made data sources like this…
Known in name authority circles as 'the Siveter problem'
Every time we add a new data set, the quality of the data within the Names pilot improves. We recently added information from the University of the West of England, and the QA process highlighted a previously unnoticed problem with the original Merit data.