The document summarizes a research project conducted by the Cataloging and Metadata Services unit at Utah State University to analyze user search behavior and the performance of MARC records in search results. The project involved analyzing web logs of searches, scraping search results pages, and coding records and fields in Airtable. Key findings included that MARC records make up around 20% of search results on average, vendor records appear more often than locally created records, and the 245 and 505 fields were most important for retrieving records while the 505, 520 and 650 fields had the greatest impact if missing from records. Guidelines for cataloging practice were proposed based on the findings.
15 Student Data Secrets that Could Change Your Library, Number 5 Will Shock You (Tiffany Garrett)
For two years librarians at Nevada State College have been collecting student-level data on library resource use and matching it to student success outcomes like retention and GPA. This presentation will share what we’ve learned about collecting, storing, and securing student-level data sets.
The document discusses challenges with managing electronic resources due to issues with metadata from content providers. It summarizes that incorrect, outdated, or incomplete metadata from publishers can lead to resources not being discoverable by users or libraries unaware they own content. The document then recommends solutions for libraries such as promoting metadata standards, documenting entitlements, and collaborating with other institutions and vendors to address problems in the complex data supply chain for e-resources.
This document provides an overview of data collection, interpretation, visualization and ethics. It discusses collecting both quantitative and qualitative data, and cleaning data by addressing formatting errors and incomplete information. Data can be transformed through mathematical formulas or by assigning categories. Descriptive statistics summarize data, while inferential statistics determine relationships. Visualization methods include tables, bar graphs and scatter plots. Best practices for visualization include clear labels, standard intervals and avoiding unnecessary complexity. Data collection and use raise ethical issues around privacy and informed consent.
Buy Only What You Need: Demand-Driven Acquisition as a Strategy for Academic ... (Michael Levine-Clark)
The document summarizes the University of Denver's implementation of demand-driven acquisition (DDA) for ebooks and print books. It discusses data showing a high percentage of unused books purchased under the previous just-in-case model. The new DDA model allows books to be purchased only after a certain number of uses or short-term loans, reducing unnecessary spending. The transition involves setting up plans with ebook vendors EBL and YBP to provide access and integrate purchasing workflows with the library system. Assessment of the new model will examine use data and purchasing patterns over time.
2018 02-13 pathways-data enquiry_martina_emke (Dr Martina Emke)
This document discusses a research study analyzing how freelance language teachers use Twitter for professional development. It employs Deleuzo-Guattarian concepts of assemblages, rhizomes, and becomings to analyze teachers' participation in Twitter networks like #ELTchat. Situational analysis and social network analysis were used to map relations between teachers, hashtags, and the "Twitter machine." Emerging findings suggest teachers' professional development occurs through unpredictable interactions within human and technological assemblages, reconfiguring understandings of teaching and professional learning.
This document contains Jeffrey Xavier's resume. It summarizes his work experience in educational evaluation, research assistance, data collection, and internships. It also lists his technical skills in SPSS, SQL, VBA, Java, and various computer programs and operating systems. Finally, it details his graduate education in applied sociology and his undergraduate dual major in sociology and psychology at UMass Dartmouth, along with his secondary education at Bishop Feehan High School.
Escape the data dungeon: Shedding light on strategies to share your findings (Kimberly Vardeman)
Presentation by Kimberly Vardeman and Jessica van Haaften at Designing for Digital on March 5, 2018.
We explore how to share user research results with library colleagues, students, and faculty. Our goal is to report information in a meaningful and useful way, while providing transparency and presenting the Library positively. We want to communicate how we take action—that user feedback doesn’t disappear into a data dungeon. We offer a review of how university library websites report findings publicly. We present successes and failures of sharing our work internally and externally.
For any questions or feedback, please contact me.
The document summarizes a presentation about using electronic data collection to inform staffing decisions at reference desks in the USF Tampa Libraries. Statistical data was collected using tools like tally sheets, clickers, and Aeon Desktracker to analyze reference questions. This data found that most questions were basic informational queries that could be handled by graduate assistants, and led to recommendations like single-staffing librarians during peak times and instituting better referral systems between departments. The libraries implemented these changes, reducing hours but maintaining coverage through virtual reference options. Evaluation of the data supported modifying scheduling and increasing focus on consultations and individual research assistance.
Resources in uct libraries is_hons_masters_2017 (Susanne Noll)
An introduction to University of Cape Town (UCT) Libraries resources, including navigating the website, understanding print and digital resources, getting to know a reference management tool, and helping students evaluate resources.
This document describes a study being conducted as part of LILAC, a multi-institutional initiative that analyzes gaps in students' information literacy skills. The study involves surveying and observing 50 students at Kennesaw State University to understand their research behaviors and abilities. Initial quantitative analysis found that most students conduct research on the web rather than academic databases and have trouble evaluating source types. Qualitative coding is being used to analyze observations of students' search and evaluation processes. The researchers hope to identify ways to help students improve their information literacy skills.
Crossref webinar: Anna Tolwinska - Crossref Participation Reports Metadata 09...
Online discovery portals are providing information about your content to researchers and linking to your site via Crossref. A richer record can result in significantly more traffic from places you weren’t expecting.
Learn about where publisher metadata goes, how it is used, and the importance of depositing rich metadata in making the most of these downstream services.
Our speakers include Stephanie Dawson of ScienceOpen; Pierre Mounier of OPERAS, OpenEdition, and the HIRMEOS project; and Laura J. Wilkinson and Anna Tolwinska of Crossref.
Webinar held September 11, 2018
Academic Library Impact: Improving Practice and Essential Areas to Research (Lynn Connaway)
The document discusses priority areas for researching the value and impact of academic libraries. It identifies the key areas as communication, mission alignment, learning analytics, student success, teaching and learning, and collaboration. For each area, it provides exemplar effective practices from literature and interviews with librarians and administrators. It then outlines potential research questions within each area and discusses research design considerations. The document concludes with an overview of a visualization tool being developed to showcase findings.
Presentation for my co-authored paper "Open University Data" at the 2012 CIIT conference. It describes the process and benefits of opening parts of the Faculty of Computer Science and Engineering's data in a structured format.
NISO/BISG 7th Annual Changing Standards Landscape Forum: ALA Chicago User Pra...
This document summarizes findings from faculty surveys about use of scholarly monographs. It finds that monographs remain very important to researchers, especially in humanities. While e-book usage is growing, print still dominates for in-depth reading. Searching and skimming are easier digitally. Over time more believe e-books could replace print, though humanities remain less convinced. The document also notes historians' heavy reliance on Google Books for discovery and access.
How are MARC records performing in our search environment? This presentation will look at the process and results of a research project that analyzed how users’ search terms matched up with MARC fields, as well as how and where MARC records were displayed in search results lists. Presenters will discuss the process, the results of the project, and outline how attendees can implement similar research projects at their institutions, including tools and techniques they can use to analyze how their own records are surfacing in a search environment.
“More than Meets the Eye” - Analyzing the Success of User Queries in Oria (TimelessFuture)
This document analyzes query data from the University of Oslo library search engine Oria to gain insights into search behavior and query success rates. The analysis found that (1) many of the most popular queries were for curriculum-related materials and had successful results, (2) "zero result" queries were often due to too specific queries like pasted references or misspellings, and (3) suggestions are provided to improve query suggestions, expand indexing, and better integrate curriculum materials to help resolve more queries.
A Close Look at the Four Million Archival MARC Records in WorldCat (OCLC)
Standards for archival description have been in place for more than thirty years, but what does actual practice look like? In this OCLC Research Library Partners Works in Progress webinar presented 3 December 2015, OCLC Research Program Officer Jackie Dooley gave an overview of her deep dive into the four million records for archival materials in WorldCat.
This document summarizes the results of a study analyzing circulation data and interlibrary loan (ILL) usage to evaluate the scope and coverage of a library's print collection. Some key findings include that 48% of titles have never circulated, 88% circulated 5 times or fewer, and 97% have not circulated in the last year. Subject areas with high ILL usage but low circulation may need increased purchasing. Challenges included inconsistencies in the library system data and scaling expectations about what data could be extracted and analyzed. The study aimed to identify areas of the collection that are over- or under-used to inform collection development and management decisions.
Internal cooperation and external satisfaction (Annegrete Wulff)
1. The document outlines the divisions and responsibilities within Statistics Denmark for disseminating statistics through their website and StatBank database.
2. Key responsibilities include maintaining the website and StatBank, creating metadata for tables, loading and publishing data, and gathering user feedback.
3. An annual planning meeting is held between the statistics divisions and dissemination division to discuss new tables, follow-ups, challenges, and plans.
Ethan Pullman and Denise Novak presented on how librarians can stay informed about text mining to better support their constituents. Kristen Garlock discussed JSTOR's Data for Research service which allows researchers to generate datasets for text mining. Patricia Cleary provided an overview of Springer's text and data mining policy which allows researchers to text mine subscribed content for non-commercial research.
Learn about preliminary results of research undertaken to answer the question of how the Core Competencies for Electronic Resources Librarians, adopted in July 2013 by NASIG, have affected the qualifications for and responsibilities of electronic resources librarians as they are depicted in job ads posted between 2012 and 2014.
This is an archive of a webinar delivered on January 12, 2012. Description: If you’re really new to cataloging, this session is for you. In this 90-minute online session, facilitated by NEKLS technology librarian Heather Braum, you will:
learn the basic principles behind cataloging,
discover why librarians catalog,
learn to read a basic MARC record,
see what a good MARC record looks like,
learn basic cataloging terminology,
and practice describing different materials.
Special thanks to Robin Fay for allowing me to use a couple of the ideas shared in this webinar and presentation. See her outstanding slides: http://www.slideshare.net/robinfay/cataloging-basics-presentation.
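As an illustration of the kind of record the session covers (a hypothetical example of my own, not taken from the webinar itself), a basic MARC bibliographic record fragment might look like this:

```
100 1  $a Fitzgerald, F. Scott, $d 1896-1940.
245 14 $a The great Gatsby / $c F. Scott Fitzgerald.
260    $a New York : $b Scribner, $c 1925.
300    $a 218 p. ; $c 20 cm.
650  0 $a Nineteen twenties $v Fiction.
```

Each line is a field identified by a three-digit tag (100 is the main entry personal name, 245 the title statement, 650 a subject heading), followed by one or two indicator digits and subfields marked with $ delimiters.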
Report on Usability Process and Assessment of Yufind (kramsey)
This document summarizes the results of usability testing conducted on Yufind, an alternative interface for Yale University Library's catalog. Usability tests were conducted to evaluate whether users would see and successfully use facets to filter search results. While facets were sometimes seen and used, subsets did not always make sense and facets were hard to navigate. Based on the tests, recommendations were made to limit the number of facets displayed and make them more focused on user behavior. A survey found that over 50% of respondents preferred Yufind to the previous system and rated it positively. The usability process highlights assessing current user behavior, testing changes, and reassessing to determine standards and priority functionality.
Search is now normal behaviour: what do we do about that? November 2009 (Caroline Jarrett)
An industry case study presented to OzCHI 2009: 21st Annual Conference of the Australian Computer-Human Interaction Special Interest Group (CHISIG) of the Human Factors and Ergonomics Society of Australia (HFESA), Melbourne, Australia
Search & Recommendation: Birds of a Feather? (Toine Bogers)
In just a little over half a century, the field of information retrieval has experienced spectacular growth and success, with IR applications such as search engines becoming a billion-dollar industry in the past decades. Recommender systems have seen an even more meteoric rise to success with wide-scale application by companies like Amazon, Facebook, and Netflix. But are search and recommendation really two different fields of research that address different problems with different sets of algorithms in papers published at distinct conferences?
In my talk, I want to argue that search and recommendation are more similar than they have been treated in the past decade. By looking more closely at the tasks and problems that search and recommendation try to solve, at the algorithms used to solve these problems and at the way their performance is evaluated, I want to show that there is no clear black and white division between the two. Instead, search and recommendation are part of a much more fluid continuum of methods and techniques for information access.
(Keynote at "Mind The Gap '14" workshop at the iConference 2014 in Berlin, Germany)
Discovery Systems: Connecting the 21st Century Academic User to Content (Athena Hoeppner)
Discovery systems couple a central index of metadata and content with a feature-rich discovery layer to help users find information. UCF's discovery service indexes over 690 million records from various sources and links users to full text over 80% of the time. Studies found it included relevant high-quality content for nursing and science papers. Embedding discovery into learning management systems reduces cognitive load for online students and simplifies accessing full text from courses. Discovery services also expose open access outputs by including them prominently.
OA in the Library Collection: The Challenge of Identifying and Managing Open ... (NASIG)
Librarians, researchers, and the general public have largely embraced the concept of open access (OA). Yet, incorporating OA resources into existing discovery and tracking systems is often a complicated process. Open access material can be delivered through a variety of publishing or archival mechanisms, creating certain challenges, particularly for those managing e-resources. Although an increasing proportion of research output is becoming open access each year, organization and discovery of these resources remains imperfect.
The debate over the relative merits of Green and Gold OA is regularly discussed in academic circles, but less attention is devoted to Hybrid OA and the challenges inherent in this model. Most major publishers offer open access through one or more of these models, but open access metadata standards seem to be lacking among these content providers. The presenters will discuss some of these challenges identified in the literature and through other mechanisms, including data gathered by NISO and an original survey. By identifying these issues, the scholarly communication community can work together to improve discovery for end users.
Chris Bulock
Electronic Resources Librarian, SIUE Lovejoy Library
Chris is an Electronic Resources Librarian and NASIG member from the St. Louis area. His research and work are focused on improving the library user's experience. Chris is the recipient of the 2012 HARRASSOWITZ Charleston Conference Scholarship.
Nathan Hosburgh
Discovery & Systems Librarian, Rollins College
Nate Hosburgh is currently the Discovery & Systems Librarian at Rollins College in Winter Park, Florida as part of a revamped Collections & Systems department that includes ILL, collection development, acquisitions, systems, and technical services. Previously, he held positions managing e-resources at Montana State University and managing interlibrary loan & document delivery at Florida Institute of Technology in Melbourne.
Online Catalogs: What Users and Librarians Want (Karen S Calhoun)
The document summarizes research conducted to understand what online catalog users and librarians want from metadata. Focus groups and surveys found that end users prioritize easy delivery of content over discovery and want more summaries and links to full text. Librarians placed more emphasis on accuracy but also recommended improvements like merging duplicates and adding tables of contents. Both groups saw a need for better search relevance.
Data curator: who is s/he? Findings of the IFLA Library Theory and Research... (Anna Maria Tammaro)
The document summarizes findings from a research project on data curation roles and responsibilities. It outlines the project's phases which included a literature review, content analysis of job postings, and interviews. The content analysis of over 400 job postings found that roles involved in data curation have diverse titles and responsibilities often include instruction, reference, outreach, access, and preservation services. Data curators work to ensure long-term access and understanding of research data across its lifecycle.
Out of the box, a SharePoint Search for a word or two isn't that powerful. When combined with powerful properties and operators, search can really sing. There are simple ways for informed users to get the search results they're looking for by learning some KQL, the Keyword Query Language. In this session we spend most of the time in demos in the search interface, but these slides contain lots of tips and tricks for better search for users.
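By way of illustration (these example queries are my own assumptions, not taken from the slides), KQL combines free-text terms with property restrictions and operators against SharePoint's managed properties:

```
author:"Jane Doe" AND filetype:docx
title:budget*
write>=2018-01-01 AND IsDocument:true
"annual report" NEAR(n=3) sales
```

The first query finds Word documents by a given author; the second matches items whose title begins with "budget"; the third restricts results to documents modified since the start of 2018; the last finds the phrase "annual report" within three words of "sales".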
Toward an automated student feedback system for text based assignments - Pete... (Blackboard APAC)
As blended learning environments and digital technologies become integrated into the higher education sector, rich technologies such as analytics can help teaching staff identify students at risk, learning material that is not proving effective, and learning site designs that aid and facilitate improved learning. More recently, consideration has been given to automated essay scoring. Such systems can be used in a formative way, such as providing feedback on initial assignment drafts, or summatively through the analysis of final assignment submissions. Further, providing students with quick feedback on written assignments opens the opportunity, through formative feedback, for improved learning outcomes.
This presentation details a current project developing a system to analyse text-based assignments. The project is being developed for broad application, but the findings focus on an undergraduate pilot subject: ‘Ideas that Shook the World’ (a compulsory first-year Bachelor of Arts subject taught on 5 campuses to more than 1000 students by 15 staff). Preliminary results of a first scan of assignments are presented, and the issues raised in developing the system are discussed, together with an outline of additional work planned for the project. It is believed the work will have wide application where text-based assignments are utilised for assessment.
From Exploration to Construction - How to Support the Complex Dynamics of In... (TimelessFuture)
Search engines on the Web provide a world of information at our fingertips, and the answers to many of our common questions are just one click away. However, for the complex and multifaceted tasks involving a process of knowledge construction, various information seeking models describe an intricate set of cognitive stages (Kuhlthau, 2004; Vakkari, 2001). These stages influence the interplay of users’ feelings, thoughts and actions. Despite the evidence supporting these models, common search engines, nowadays the prime intermediaries between information and user, still feature a streamlined set of 'ten blue links'. While efficient for lookup tasks, this approach may not be beneficial for supporting sustained information-intensive tasks and knowledge construction. Could other approaches support the complex dynamics of these ventures? Based on previous experiments, this talk discusses how the utility of search functionality during different stages of complex tasks is essentially dynamic. This provides opportunities for designing 'stage-aware' search systems, which may evolve along with a user's information journey.
Workshop presented at Webdagene 2013 (http://webdagene.no/en/) September 9, 2013; UX Lisbon (http://www.ux-lx.com), May 12, 2011; UX Hong Kong (http://www.uxhongkong.com/), February 17, 2011.
Presentation made during the Intelligent User-Adapted Interfaces: Design and Multi-Modal Evaluation Workshop (IUadaptME) workshop conducted as part of UMAP 2018
Don't Go There! Providing Discovery Services Locally, not at a Vendor's SiteKen Varnum
This document discusses the University of Michigan Library's approach to building their discovery services locally rather than using a vendor's site. Some key points:
- They built their own discovery interface because they did not have enough knowledge to integrate a vendor's tool and wanted control over the user experience.
- Having the discovery on their own site allows them to track user behavior like full-text clicks and provide local support.
- Search data shows most users search across articles and library resources, with the majority using the library's discovery interface.
- They have since enhanced discovery by adding direct article linking and a problem reporting mechanism.
The five-year plan outlines Frontiers' goals of becoming a leader in open access publishing through platform developments that integrate content with social networks and data mining. Key focuses include expanding specialty sections and research topics to facilitate interdisciplinary collaboration, growing the editorial board internationally, and reinforcing quality controls. Frontiers aims to publish over 1 million articles annually from China and the US by embracing open science principles and leveraging technology to integrate published content, metadata, and networks.
Avoiding a Level of Discontent in Finding Aids: An Analysis of User Engagemen...Andrea Payant
As part of a multi-faceted research project examining user engagement with various types of descriptive metadata, Utah State University Libraries Cataloging and Metadata Services unit (CMS) investigated the discoverability of local Encoded Archival Description (EAD) finding aids. The research team put two versions of the same finding aid online, with one described at the file (box or folder) level and the other at the item level. Over a year later, the team pulled the analytics for each guide and assessed which descriptive level was most frequently accessed. The research team also looked at the type of search terms patrons utilized and where in the finding aid they were located. Usage data shows that personal names are the most common type of search term, search terms are most commonly found in the Collection Inventory, and that the availability of item-level description improves discovery by an average of 6,100% over file-level descriptions.
This document outlines best practices for building digital collections through community crowdsourcing efforts. It discusses strategies for gathering metadata and historical information from local communities in person through meetings with historical groups and individual interviews, as well as online through web forms and comments. Lessons learned include the importance of community partnerships, making the process approachable, and thanking contributors to encourage further participation.
At Utah State University, a pilot project is under development to evaluate the benefits of tracking data sets and faculty publications using the online catalog and the Library’s institutional repository.
With federal mandates to make publications and data open, universities look for solutions to track compliance. At Utah State University, the Sponsored Programs Office follows up with researchers to determine where data has been or will be deposited, per the terms of their grant.
Interested in making this publicly discoverable, the Library, Sponsored Programs, and Research Office are working together to pilot a project that enables the creation of publicly accessible MARC and Dublin Core records for data deposited by USU faculty. This project aims to make data sets, as well as publications, visible in research portals such as WorldCat, as well through Google searches.
This presentation will describe the project and anticipated benefits, as well as outline the roles of the cataloging staff and data librarian, and the involvement of the Research Office.
The Missing Link: Metadata Conversion Workflows for EveryoneAndrea Payant
This document describes workflows developed by Utah State University and the University of Nevada, Las Vegas to streamline metadata creation between special collections and digital initiatives departments. The workflows allow for converting finding aid information into Dublin Core for uploading item records to a digital repository, and batch linking digitized content to finding aids. The processes are designed to be taught easily and performed by various staff levels to automate metadata work and make it more flexible.
Mitigating the Risk: identifying Strategic University Partnerships for Compli...Andrea Payant
Payant, A., Rozum, B., Woolcott, L. (2016). Mitigating the Risk: Identifying Strategic University Partnerships for Compliance Tracking of Research Data and Publications. International Federation of Library Associations (IFLA) Satellite Conference: Data in Libraries: The Big Picture
Just Keep Cataloging: How One Cataloging Unit Changed Their Workflows to Fit ...Andrea Payant
Utah State University Libraries Cataloging and Metadata Services (CMS) unit, including student workers, transitioned to remote cataloging in March 2020 due to the COVID-19 pandemic. The presentation will outline the process undertaken by supervisors to evaluate and modify services and workflows to continue cataloging materials through the different phases of library capacity from shutting down most of the library, to a hybrid limited staff capacity, through staff back in the library full-time.
But Were We Successful: Using Online Asynchronous Focus Groups to Evaluate Li...Andrea Payant
USU launched a program in 2016 to connect researchers seeking federal funding with librarians to assist them with data management. This program assisted over 100 researchers, but was it successful? Our presentation will discuss how we evaluated the success of this program using online asynchronous focus groups (OAFG) in conjunction with a traditional survey. Our cross-institutional research team will share our findings as well as the challenges and successes of using OAFGs to assess library services.
Assessment and Visualization Tools for Technical ServicesAndrea Payant
A survey and demonstration of open source, freely available tools to help technical services units assess their work, collect and analyze data, create infographics, and visually demonstrate their impact on the library and their patrons.
The document discusses research data management at Utah State University (USU). It provides a history of USU's data management efforts beginning in 2013 with the creation of a campus committee and the hiring of a Data Librarian in 2015. The librarians developed a compliance program to meet federal requirements for data sharing and launched it in 2016. They now provide standard resources like a website and consultations, as well as non-standard services like annual communication with researchers regarding data deposit requirements. The document concludes with suggestions for backing up data using the "Rule of 3," describing data adequately, and organizing data files and directories.
liwalaawiiloxhbakaa (How We Lived): The Grant Bulltail Absáalooke (Crow Natio...Andrea Payant
USU was selected to host a unique collection of oral histories from Grant Bulltail, Crow Storyteller and 2019 NEA National Heritage Fellow, representing the stories and knowledge of the Crow Nation as passed down by his ancestors. The collection spans 20+ years of field work and collaboration across library departments and regional partners.
Crowdsourcing Metadata Practices at USUAndrea Payant
USU Libraries’ Cataloging and Metadata Unit has successfully investigated several methods to engage the public to involve them in the creation of metadata for USU’s Digital History Collections. Most, if not all the techniques we have tested have yielded positive results and have improved the relevancy and accuracy of our descriptive metadata.
Homeward Bound: How to Move an Entire Cataloging Unit to Remote WorkAndrea Payant
Utah State University Libraries Cataloging and Metadata Services (CMS) unit, including student workers, transitioned to remote cataloging in March 2020 due to the COVID-19 pandemic. This presentation will outline the process undertaken by supervisors to evaluate and modify services and workflows to continue cataloging service during the time when the library was shut down.
Outlines the development of the two single-service point and education initiatives, describes feedback gathered from a survey, and discusses how the Cataloging and Metadata Services unit plans to adapt services based on findings
Charting Communication: Assessment and Visualization Tools for Mapping the Co...Andrea Payant
The document summarizes a study conducted by Becky Skeen, Liz Woolcott, and Andrea Payant at Utah State University on assessing communication patterns within their cataloging and metadata services department. They used interaction logs filled out by staff weekly and an anonymous survey distributed to other library departments. The study found lower than expected interaction with other technical services units and higher interaction with special collections. It also contradicted stereotypes of catalogers being withdrawn by finding most interactions were social. The data analysis tools used included Excel, Qualtrics, Tableau and OpenRefine. Conducting this assessment on a regular basis and expanding the research was recommended to provide more useful insights into communication over time.
Memes of Resistance, Election Reflections, and Voices from Drug Court: Social...Andrea Payant
Folklorists and librarians have long championed social justice and advocacy issues. Today, the skills garnered through principled academic discourse, community based ethnographic fieldwork, and ethical librarianship are being utilized to collect, preserve, present, and educate around social themes and issues. USU folklorists and librarians are working to create robust digital collections that focus on timely social issues with informed and ethical metadata.
Giving Credit Where Credit is Due: Author and Funder IDsAndrea Payant
A process to include standardized funder and author identifiers into institutional repository and ILS records which are associated with funded research data
VOCAB for Collaboration: How “Work Language” Can Help You Win at TeamworkAndrea Payant
Clair Canfield's VOCAB model provides a framework for effective collaboration through vulnerability, ownership, communication, acceptance, and boundaries. The document discusses each element of the model and provides tips for incorporating them into teamwork. It suggests taking time for reflection, setting group agreements, embracing different communication styles, taking accountability, and accepting realities outside of one's control. Practicing these concepts can help teams work through challenges, utilize individual strengths, and adapt to change.
Can You Scan This For Me? Making the Most of Patron Digitization Request in t...Andrea Payant
This document discusses Utah State University's process for handling patron requests to digitize materials from the archives. It outlines the evolution from self-serve scanning to a mediated scanning service with a charge. The main challenges are lack of consistency, turnaround time, and documentation. The solution was to create an online digitization request form and standardized workflow. Initial results showed around 90 requests since implementation, with most being made available online. Next steps include linking digital items to finding aids and expanding the process to more complex requests within collections.
Wisdom of the Crowd: Successful Ways to Engage the Public in Metadata CreationAndrea Payant
Utah State University Libraries’ Cataloging and Metadata Unit has successfully used several methods to engage the public in metadata creation for USU’s Digital History Collections.
Airline Satisfaction Project using Azure
This presentation is created as a foundation of understanding and comparing data science/machine learning solutions made in Python notebooks locally and on Azure cloud, as a part of Course DP-100 - Designing and Implementing a Data Science Solution on Azure.
Amazon DocumentDB (with MongoDB compatibility) is a fast, reliable, fully managed database service. Amazon DocumentDB makes it easy to set up, operate, and scale MongoDB-compatible databases in the cloud. In this hands-on session, you will run the same application code and use the same drivers and tools that you use with MongoDB.
1. MARC-y MARC and the Coding Bunch
Anna-Maria Arnljots
Metadata Assistant
anna-maria.arnljots@usu.edu
Paul Daybell
Archival Cataloging Librarian
paul.daybell@usu.edu
Kurt Meyer
Government Information and E-
Resource Cataloger
kurt.meyer@usu.edu
Andrea Payant
Metadata Librarian
andrea.payant@usu.edu
Becky Skeen
Special Collection Cataloging Librarian
becky.skeen@usu.edu
Liz Woolcott
Cataloging and Metadata Services Unit Head
liz.woolcott@usu.edu
Utah Library Association Annual Conference
May 21, 2021
2. Background
• Multi-year research into user search behavior for all metadata standards employed by the unit
  First phase: MARC
  Next phases: EAD, Dublin Core
• Project started just as the library moved everyone to work from home
• Whole unit was able to participate in the coding project
3. Problem Statement
What is the correlation between user search terms, the placement of MARC records in search results lists, and the performance of individual MARC fields in a search process?
4. Research Questions
• What is the frequency and placement of MARC records in search results lists?
• Where are search terms located in MARC records?
6. Web Log Analysis
• Focused on the Discovery Layer (Encore) because it was the primary search portal used by patrons
• Pulled a list of all URLs accessed on three days
• Put into Airtable and coded
7. Web Scraping
• Filtered for URLs that led to search results pages
• Fed URLs into Octoparse, a web-scraping tool
• Scraped the list of search results, URLs, pagination, and result counts
• Numbered the results and put them into Airtable, linked to the originating URL
8. Airtable
• Search results list and URLs
  Extracted bib #
  Created formula to link to MARC view of bib
  Unit members pulled up the bib record and copy/pasted it into Airtable
  Assigned codes for:
    o Creator of record
    o Material type
    o MARC fields where term was found
    o Fields that were not present
  Automated formula examined word count of record
9. Airtable (continued)
• Web log URLs
  Coded for basic search features:
    o Page types
    o Advanced search fields used
    o Facets used
    o Page number
  Coded the queries (search terms) for:
    o Search term construction
    o Search categories (known item, topical)
    o User path
    o Known item titles
10. Airtable (continued)
• Known items pulled out specifically and coded (most for a separate project looking at the discovery layer)
  Format/Genre
  Availability
  Physical or Electronic
  Location
  Steps to access
  Listed by
  Final Content Provider
  Checkouts
  Discoverability in Google Scholar
    o Steps to access
12. Analysis 1.1:
How frequently are MARC records showing up in search results?
Batch 1 Batch 2 Batch 3 Combined
MARC-based catalog records 5264 3299 4749 13312
Records from other platforms 20326 17560 16811 54697
Total Records 25603 20859 21560 68022
Percent MARC records 20.56% 15.82% 22.03% 19.57%
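The "Percent MARC records" row follows directly from the counts in the table above (MARC-based records divided by total records). A quick sketch to reproduce it:

```python
# Counts taken from the Analysis 1.1 table: (MARC records, total records).
batches = {
    "Batch 1": (5264, 25603),
    "Batch 2": (3299, 20859),
    "Batch 3": (4749, 21560),
    "Combined": (13312, 68022),
}

# Percent of results that are MARC-based catalog records, per batch.
percent_marc = {name: round(marc / total * 100, 2)
                for name, (marc, total) in batches.items()}
```

This confirms the roughly one-in-five share of MARC records reported across all three batches.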
13. Analysis 1.2:
Is there a difference between locally created records and vendor supplied records in the frequency of listing in search results?

Record Creator                            # Records in    % Total records    # Records    % Total records
                                          results list    in results list    accessed     accessed
Vendor                                    7,727           58.05%             163          39.00%
Cataloging and Metadata Services          5,066           38.06%             239          57.18%
Distance Campus Libraries                 410             3.08%              5            1.20%
Record unavailable at time of coding      52              0.39%              2            0.48%
Patron Services, Library Media
Collections, or Resource Sharing and
Document Delivery                         33              0.25%              8            1.91%
Acquisitions                              16              0.12%              0            0.00%
Unknown                                   5               0.04%              1            0.24%
Natural History Library                   3               0.02%              0            0.00%
Total                                     13,312                             418
14. Analysis 1.3:
How are MARC records ranked in the search results list?
• The most common position for MARC records in a search result set of 25 items is position 4
• MARC records appear in the top five search results 25.35% of the time
15. Analysis 1.4:
Where do MARC records for known items rank in the search results list?

Percentage of Times Available Whole Object Appeared in Search Results by Position Number
Position        1       2       3       4       5       6-10    11-15   16-20   21-25
Total #         125     107     61      49      37      104     67      56      35
% in results    18.7%   16.0%   9.1%    7.3%    5.5%    15.6%   10.0%   8.4%    5.2%
17. Analysis 2.1:
What fields are used most in retrieving records?

MARC Fields Where Search Terms Were Located (Top 5)
MARC Field    Number of Records
245           9,100
505           4,998
650           4,806
520           3,700
600           1,328
18. Analysis 2.2:
For records accessed by the patron, is there a difference in where search terms are located?
• The 245 Title Statement remained highest, appearing 64% more often than the next most utilized field
• Instead of the 505 Formatted Contents Note being in second place, the 650 Subject Added Entry is the next most used field
• The 505 Formatted Contents Note and 520 Summary fields retained a spot in the top four fields
19. Analysis 2.3:
For locally created records and vendor-supplied records, is there a difference in where search terms are located?
Percentage of fields used in record retrieval (top 5 most frequent)
Field Field Description CMS Records Vendor Records
245 Title Statement 43.80% 51.64%
505 Formatted Contents Note 28.13% 69.65%
650 Subject Added Entry - Topical 40.89% 56.58%
520 Summary, etc. 23.41% 76.03%
600 Subject Added Entry – Personal Name 59.94% 32.68%
20. Analysis 2.4:
What fields are not present in the records?

                                  CMS                         Vendor
                                  Not Present    Present      Not Present    Present
Author (both 1xx and 7xx)         0.75%          99.25%       1.18%          98.82%
Subject (any authorized)          4.46%          95.54%       6.73%          93.27%
505 Formatted Contents Note       63.96%         36.04%       45.54%         54.46%
520 Summary Note                  75.60%         24.40%       50.45%         49.55%
All Categories Present                           14.86%                      33.26%
21. Analysis 2.5:
Which fields would make the greatest impact if not included in the record?
• The top four fields with the greatest impact on retrieval, if not found in a record: 505, 245, 520, and 650
• Without the 505 or 520, 16.86% of all records appearing in results would not have shown up
• In contrast, without the 650 and 600 fields, only 0.66% of records would not have appeared in the search results
23. Analysis
• Non-MARC records have an advantage over MARC: 80% of all records in search results are non-MARC
• MARC vendor records appear more often than locally created MARC records
• 25.35% of MARC records place in the top 5 search results
• 505 and 520 fields occur more frequently in vendor records
• Author and subject fields occur at the same rate in vendor and locally created records
24. Analysis
Title fields are most important overall, but…
• The 505 ranked higher than the 245 for records where search terms matched only one field
• The 505 was consistently in the top 4 fields that retrieved a record (along with the 520)
• If the 505 were missing, 12% of all MARC results would not have been displayed
25. Analysis
Subject fields are important, but…
• Third most important field for matching search terms
• Second most important field for records viewed by patrons
• Only 0.55% of records would not have been displayed if the subject field were missing
• Only one instance of subject fields being “clicked on”
• 1xx fields were much more likely to be “clicked on”
26. Take-Aways
▫ Catalogers will retain the ability to make the best judgment for each record, but will be asked to consider the following guidelines:
- More emphasis on creating 505 and 520 notes in local records
- Less emphasis on 6xx fields as an entry point
- More emphasis on 1xx fields as an entry point
27. MARC-y MARC's Coding Bunch
• Anna-Maria Arnljots
• Josee Butler
• Ryan Bushman (Stats)
• Paul Daybell
• Barbara Fleming
• Maddie Gardner
• Alisha Grant
• Bryn Larsen
• Sabrina Leatham
• Rachel Olsen
• Andrea Payant
• Kurt Meyer
• Jessica Mills
• Abby Rodabough
• MaKayla Roundy
• Melanie Shaw
• Becky Skeen
• Sara Skindelien
• Seth Westenburg
• Liz Woolcott
29. Full Procedures: https://usulibrary.atlassian.net/l/c/8H7jgU98
Article with final results:
Liz Woolcott, Andrea Payant, Becky Skeen & Paul Daybell (2021) Missing the
MARC: Utilization of MARC Fields in the Search Process, Cataloging &
Classification Quarterly, 59:1, 28-52, DOI: 10.1080/01639374.2021.1881010
Related articles
Robert Heaton & Liz Woolcott. Unraveling the (Search) String: Assessing Library
Discovery Layers Using Patron Queries. Library Assessment Conference, January
2021
• Presentation: https://www.libraryassessment.org/program/2020-schedule/#jan21
• Paper: https://www.libraryassessment.org/2020-proceedings/
30. Questions?
Anna-Maria Arnljots
Metadata Assistant
anna-maria.arnljots@usu.edu
Paul Daybell
Archival Cataloging Librarian
paul.daybell@usu.edu
Kurt Meyer
Government Information and E-
Resource Cataloger
kurt.meyer@usu.edu
Andrea Payant
Metadata Librarian
andrea.payant@usu.edu
Becky Skeen
Special Collection Cataloging Librarian
becky.skeen@usu.edu
Liz Woolcott
Cataloging and Metadata Services Unit Head
liz.woolcott@usu.edu
I will now give you a quick overview of our methodology for our project
In order to determine how MARC records interacted with the user search process, the research team examined the logs of URLs that were generated by Encore, our library’s discovery layer.
Each search session in Encore generates a combination of static and dynamic URLs. Dynamic URLs capture a user’s search terms and any facets selected, advanced search categories used, additional search result pages accessed, and bibliographic record numbers for MARC record pages.
Google Analytics was used to gather reports of time-stamped URL logs generated over the course of multiple days.
The resulting data was put into Airtable, a relational database, for further analysis.
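The dynamic URLs described above carry the user's search terms, facets, and result-page numbers in their query strings. As a rough illustration of that decoding step (the parameter names below are hypothetical stand-ins; the real Encore URL scheme is not shown in this presentation), the coding could be sketched as:

```python
from urllib.parse import urlparse, parse_qs

def code_search_url(url):
    """Split a discovery-layer URL into the pieces the team coded in Airtable.

    The parameter names (lookfor, facet, page) are invented for this sketch,
    not taken from Encore's actual URL scheme.
    """
    params = parse_qs(urlparse(url).query)
    return {
        "search_terms": params.get("lookfor", []),   # user's query text
        "facets": params.get("facet", []),           # facets selected, if any
        "page": params.get("page", ["1"])[0],        # result-page number
    }

# Hypothetical example URL for demonstration.
coded = code_search_url(
    "https://library.example.edu/search?lookfor=crow+oral+histories&facet=format:Book&page=2"
)
```

Each coded dictionary would become one row in the web-log table, linked to its scraped results.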
The Google analytics report of URL logs was downloaded, and dynamic URLs that led to a search results page were isolated from the main report and fed into Octoparse, a web scraping tool. Each resulting page from the dynamic URL was scraped by Octoparse to gather data for the search terms used, the number of results on the page, the total number of results available to the user, and the title and link of each item in the list of results presented to the user on that page.
The results were numbered and added to our Airtable database and then linked to the originating URL.
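Octoparse is a point-and-click tool, but the same scrape-and-number step can be sketched in plain Python. The HTML structure and record links below are invented for illustration and do not reflect Encore's actual markup:

```python
import re

# Invented sample of a search-results page; Encore's real markup differs.
SAMPLE_HTML = """
<div class="result"><a href="/record=b1234567">Ideas that Shook the World</a></div>
<div class="result"><a href="/record=b7654321">Crow Oral Histories</a></div>
"""

def scrape_results(html, origin_url):
    """Number each result and keep its title, link, and originating search URL,
    mirroring the rows the team loaded into Airtable."""
    rows = []
    pattern = re.compile(r'<div class="result"><a href="([^"]+)">([^<]+)</a></div>')
    for position, (link, title) in enumerate(pattern.findall(html), start=1):
        rows.append({"position": position, "title": title,
                     "link": link, "origin": origin_url})
    return rows

rows = scrape_results(SAMPLE_HTML, "https://library.example.edu/search?lookfor=ideas")
```

Keeping the originating URL on every row is what lets each scraped result be linked back to the search that produced it.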
Search results lists and URLs were coded to identify the bibliographic record number.
A formula was created within the system to link out to the MARC view, which was used to access and copy the full text of the MARC record into Airtable.
Codes were assigned for record creator (whether generated by library personnel or vendor supplied) and material type.
Codes also identified where the search terms appeared in the MARC record and flagged prominent categories of fields that were not present in the record.
For every instance where the search term appeared in the field, that field was copied into a separate column for further analysis.
Also, an automated formula examined the word count of each record.
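The field-level coding just described — which MARC fields contain the search term, which key fields are absent, and the record's word count — can be approximated in a few lines. The sample record and the list of tracked fields below are illustrative only, not the study's actual codebook:

```python
def code_marc_record(record_text, search_term):
    """For a copy/pasted MARC text view, note which fields contain the
    search term, flag tracked fields that are absent, and count words,
    approximating the Airtable coding and formulas described in the talk."""
    term = search_term.lower()
    fields_with_term, tags_present = [], set()
    for line in record_text.strip().splitlines():
        tag = line[:3]                      # MARC tag leads each text line
        tags_present.add(tag)
        if term in line.lower():
            fields_with_term.append(tag)
    key_fields = {"245", "505", "520", "650"}  # illustrative tracked fields
    return {
        "fields_with_term": fields_with_term,
        "missing_key_fields": sorted(key_fields - tags_present),
        "word_count": len(record_text.split()),
    }

# Hypothetical three-field record for demonstration.
SAMPLE = """245 10 $a Missing the MARC : $b utilization of MARC fields
650  0 $a Machine-readable bibliographic data.
520    $a Examines how MARC fields perform in discovery searches."""
coded = code_marc_record(SAMPLE, "marc")
```

Here the term "marc" matches the 245 and 520 but not the 650, and the absent 505 is flagged, which is exactly the kind of per-record evidence the impact analyses (2.4 and 2.5) aggregate.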
Web log URLs were also coded for basic search features, including page types, advanced search fields, facets used, and search result page numbers.
Queries, or search terms, were coded as well to parse out how search terms were constructed, search categories (either known item or topical), user paths, and known item titles.
Finally, known item searches were pulled out and coded. The search terms entered by the user were analyzed through a multi-step process that reran the same terms in a browser to ascertain if the search terms reasonably matched the title or identifier of a known item.
When found, the corresponding URLs were tagged as Known Items and coded for format, availability, medium, location, keywords used etc.
Following this coding, each known item was double checked by a research team member to determine if the library provided access to it, either physically or in electronic format.
Paul will now go over the results of our data and coding
So, just to summarize what Paul said: non-MARC records have a clear advantage over MARC in our discovery layer. 80% of all results came from non-MARC sources, despite non-MARC records making up 60% of the database. And MARC records only place in the top 5 results a quarter of the time.
If we just look at MARC records by themselves, though, we see that Vendor records appear more often than locally created records and are more likely to include the 505 and 520 fields. They have the same frequency of author and subject fields as records cataloged locally, though, so 1xx and 6xx fields are not making a difference between the two types of records.
We suspect that full text search in non-MARC records and the greater presence of 505 and 520 fields in Vendor records provide more words and phrases for the index to search against. And that our own work is less visible because we aren’t putting our emphasis in these places.
In fact, if we look further into how the 505 functions, we find that while title fields were the most important field overall, the 505 ranked higher than 245 for records where search terms matched only one field (meaning those search terms weren’t found anywhere else in the record.) The 505 and 520 Summary Notes were consistently in the top 4 fields that retrieved a record
Most telling of all was that in 12% of all records, if the 505 had not been present, the record would not have been displayed in the search results list at all. The only other field more significant than this was the Title field.
Let’s take a look now at how authorized fields like the subject and author fields interact with search terms. Subject fields are important, but results on how they interact with search terms are mixed. The subject field is the third most important field for matching search terms and the second most important field for records viewed by patrons, but only 0.55% of records would not have been displayed if the subject field had been missing. So, while the data demonstrated that search terms matched subject headings frequently, it also demonstrated that those same terms were frequently available elsewhere in the record already.
Additionally, it was very obvious that subject headings were rarely used as a means for finding other materials (for instance, when we envision a patron "clicking on" a subject link to find like materials). There was only one instance of subject fields being “clicked on” to bring up related records. This is, in large part, due to the limited visibility of subject headings on the main search page: you can only access the terms through the record itself (if the patron clicks on it) or, on occasion, in a “tag” field at the bottom of the facet column. Whether this is due to interface design or to the utility of the field itself, we cannot definitively say. However, 1xx creator fields were the most likely authorized heading fields to be used, and the data displayed evidence of them being used to find related records and materials. They are also the more visible of the authorized heading fields, not only showing up in the search results list but also being actionable from that list without having to enter the record.
In reviewing all the data, the unit developed a few "take-aways" that we could incorporate in our day-to-day work. These included taking more time to add 505 Formatted Content Notes or 520 Summary fields to locally created records. We felt the data demonstrated that additional 505 and 520 fields would likely make our records more visible to the search algorithms. Additionally, we will place less emphasis on the subject fields as part of our workflow. This doesn't mean eliminating subject work from what we do – but rather just not spending as much time developing subject headings as before. We will also continue our authority work on the 1xx creator fields, as they are the most visible of the controlled headings fields and also highly visible in the search results page. These aren't hard and fast rules, but rather guidelines to follow. Our catalogers will continue to be able to exercise their own judgment when creating records. But having this understanding of how the records are used will be imperative in that judgment making process.
We would like to thank the following people for all of their help in making this research process possible. The whole Cataloging unit at USU Libraries, including catalogers, cataloging assistants, and student technicians participated in this project. We would also like to thank Ryan Bushman, the assistant to our Assessment Librarian for all his help with the statistics for this project. We are so appreciative to this whole coding bunch!
If you would like to try out this process yourself – we have put our step by step instructions online at the URL you see above. This will include all of the procedures we used to pull the data from Google Analytics, scrape the data with Octoparse, and our codebooks that all of the project contributors used. We will also put this link into the chat for you.
You can also read about this process and the results in our recently published article in Cataloging and Classification Quarterly, titled "Missing the MARC: Utilization of MARC Fields in the Search Process." The DOI above links to the article. We will also put that into the chat for you. Note that both of these links are available on the handout for this session, too.
The data from this project was also used in a recent publication and presentation at the Library Assessment Conference, which examined how patrons used the library's discovery layer, Encore. The links are available on this slide and we will put them into chat as well. Just note that the proceedings aren't quite up yet, but should be soon.
Thank you for your time! Does anyone have any questions?