PCC Wikidata Pilot at Texas A&M University Libraries Progress Report
Through the PCC Wikidata Pilot, Texas A&M University Libraries aims to become familiar with Wikidata, experiment with linked data, and enhance the discoverability of institutional records. Focusing on the Mechanical Engineering Department's faculty advisors, graduate students, and dissertations, the team is drafting application profiles, creating Wikidata items, and exploring automation methods to enrich the data and promote transparency in academic information management.
Presentation Transcript
The PCC Wikidata Pilot at Texas A&M University Libraries: A Progress Report
Jeannette Ho
March 24, 2021
Our Goals for Getting Involved with the Pilot:
- To gain familiarity with Wikidata and its tools
- To compare the process of creating Wikidata entries with our traditional process of authority control
- To expose data about the contents of our collections on the wider web and make it more discoverable to the public
- To demonstrate the value of linked data by experimenting with Wikidata's SPARQL Query Service to explore relationships among the entities we plan to create
- To create items for persons and organizations at our university and affiliated agencies that may not yet have established identities, which may help disambiguate authors in our institutional repository
Our project:
- We will focus on faculty advisors, graduate students, and their doctoral dissertations from the Texas A&M Mechanical Engineering Department
- Dissertations are deposited in our institutional repository (OAKTrust)
- We will create and/or edit Wikidata items for persons (faculty and thesis authors affiliated with A&M) and dissertations
- Other possibilities: associated organizations, subject areas, and disciplines
- We will do this both manually and by experimenting with various tools for automation: matching entities in Wikidata, reconciling entities, and batch uploading data
- We will design SPARQL queries in the Wikidata Query Service
What We Have Done Until Now:
- Created drafts of application profiles for faculty advisors, graduate students (authors of dissertations), and the dissertations themselves
- Began to manually create and enhance items on Wikidata for these entities
- Began to experiment with methods of automating this process:
  - QuickStatements
  - OpenRefine Wikidata extension
- Began to document these processes and workflows
  - This is an ongoing effort as we refine them
  - Documentation is on our team's internal Microsoft Teams web channel; we will share it publicly when finalized
Considerations when drafting application profiles:
- What entities should we create items for?
  - Authors of dissertations, faculty advisors, and the dissertations themselves
  - But what about members of doctoral thesis committees?
- What properties should we include?
  - Gender: to include or not to include? Wikidata policy says we should, but we can't always assume we know what it is.
- We reviewed:
  - Existing application profiles from Stanford and the University of Washington
  - Guidelines from WikiProject Books on properties recommended for work vs. edition items
- What properties should be "core"?
  - Consulted with our colleagues in the Office of Scholarly Communications: What questions were they interested in? What statistics would they like to gather?
  - Looked at properties we tended to include in common when creating Wikidata items
Properties of interest to Scholarly Communications:
- Year that someone earned a degree
- Gender and other demographic information
- Academic subject or discipline
- Whether a student published with a professor
- Professional activity of students after they graduate
- Name changes for organizations
- Academic advisors
Uploading existing metadata about dissertations from our repository into Wikidata:
Screenshot of Wikidata schema in OpenRefine for faculty advisor
Screenshot of Wikidata schema in OpenRefine for author of a dissertation
Screenshot of Wikidata schema in OpenRefine for a dissertation
Examples of Wikidata items batch created or enhanced through OpenRefine:
- Jorge Alvarado (Q59297625): https://www.wikidata.org/wiki/Q59297625
- Qibo Li (Q105752010): https://www.wikidata.org/wiki/Q105752010
- Numerical Fluid Dynamics and Combustion Study of Emulsified Canola Oil Droplets in a Swirl Promoted Combustion Chamber (Q105752011): https://www.wikidata.org/wiki/Q105752011
What we learned:
- Our institutional repository already has plenty of metadata that we can use to automate this process. If we can automate the batch creation of items on Wikidata, it could save us the time of creating each one manually.
- Many dissertations are embargoed. Is it OK to put their titles on Wikidata?
- We had to batch create items for the entities first, before we could add statements that create reciprocal links between them.
  - For example, the Wikidata items for an author and a dissertation need to be created first. We then needed a separate schema to add the linking statements between them, and had to run it separately.
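The two-pass pattern described above (create the items first, then add the linking statements in a separate run) can be sketched in QuickStatements V1 command syntax, one of the tools the team experimented with. This is a minimal illustration, not the team's OpenRefine schema: the function names, the label, and the description are hypothetical, and the property and item IDs (P31 instance of, Q5 human, P184 doctoral advisor, P185 doctoral student, P50 author, P1026 academic thesis) should be checked against Wikidata before any real upload.

```python
from typing import List


def creation_commands(label: str, description: str, instance_of: str) -> List[str]:
    """Pass 1: tab-separated QuickStatements V1 commands that create one item.

    CREATE makes a new item; LAST refers to the item just created.
    Labels containing double quotes are not handled in this sketch.
    """
    return [
        "CREATE",
        f'LAST\tLen\t"{label}"',        # English label
        f'LAST\tDen\t"{description}"',  # English description
        f"LAST\tP31\t{instance_of}",    # instance of (assumed: Q5 = human)
    ]


def linking_commands(author_qid: str, advisor_qid: str, thesis_qid: str) -> List[str]:
    """Pass 2: reciprocal links, possible only once every QID already exists."""
    return [
        f"{author_qid}\tP184\t{advisor_qid}",  # author -> doctoral advisor
        f"{advisor_qid}\tP185\t{author_qid}",  # advisor -> doctoral student
        f"{thesis_qid}\tP50\t{author_qid}",    # dissertation -> author
        f"{author_qid}\tP1026\t{thesis_qid}",  # author -> academic thesis
    ]


if __name__ == "__main__":
    # Pass 1 with a hypothetical label and description.
    print("\n".join(creation_commands("Example Author", "hypothetical example item", "Q5")))
    # Pass 2 with the QIDs listed in the examples from this presentation.
    print("\n".join(linking_commands("Q105752010", "Q59297625", "Q105752011")))
```

The second function cannot be run until the first pass has returned real QIDs, which mirrors the need for a separate linking schema in OpenRefine.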
We can potentially automate the creation of most of these entities in cases where the same faculty member advised multiple students.
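Deciding which records go to a batch upload and which are handled manually can be sketched as a simple grouping step. The record layout (the `advisor` and `title` keys), the sample values, and the two-student threshold are assumptions made for illustration; they are not taken from the OAKTrust metadata.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Record = Dict[str, str]


def split_by_advisor(
    records: List[Record], min_students: int = 2
) -> Tuple[Dict[str, List[Record]], Dict[str, List[Record]]]:
    """Group dissertation records by advisor and split them into two sets.

    Advisors with at least `min_students` records are candidates for batch
    creation; the remaining advisors are left for manual editing.
    """
    by_advisor: Dict[str, List[Record]] = defaultdict(list)
    for record in records:
        by_advisor[record["advisor"]].append(record)

    batch = {a: recs for a, recs in by_advisor.items() if len(recs) >= min_students}
    manual = {a: recs for a, recs in by_advisor.items() if len(recs) < min_students}
    return batch, manual


if __name__ == "__main__":
    # Hypothetical sample records.
    sample = [
        {"advisor": "Advisor A", "title": "Dissertation 1"},
        {"advisor": "Advisor A", "title": "Dissertation 2"},
        {"advisor": "Advisor B", "title": "Dissertation 3"},
    ]
    batch, manual = split_by_advisor(sample)
    print("batch:", sorted(batch))
    print("manual:", sorted(manual))
```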
Next steps:
- Test how this can work in practice: split the work among team members so everyone gets a chance to automatically create items on Wikidata
- Start small: work with dissertations downloaded from our repository that contain ORCID iDs for the authors
- Each team member will upload data for a single advisor and 2-3 associated ETDs and authors at a time
- Create the remaining entities manually for advisors in this sample who have only one student, and see what extra information gets added that we can't automate
- Pull data from our local VIVO instance of faculty profiles and batch upload the additional data as a separate step
  - Last name, first name, positions, where they were educated, degrees received, etc.
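Selecting the "start small" sample of dissertations whose authors have ORCID iDs could look like the sketch below. The record layout (an `orcid` key) and the helper names are hypothetical; the check character calculation follows the ISO 7064 MOD 11-2 scheme that ORCID uses for the final character of an iD.

```python
import re
from typing import Dict, List, Optional

# Four hyphen-separated groups of four characters; the last may be a digit or X.
ORCID_PATTERN = re.compile(r"\d{4}-\d{4}-\d{4}-\d{3}[\dX]")


def orcid_check_character(base_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid: str) -> bool:
    """Return True if the string is well formed and its check character matches."""
    if not ORCID_PATTERN.fullmatch(orcid):
        return False
    digits = orcid.replace("-", "")
    return orcid_check_character(digits[:15]) == digits[15]


def records_with_orcid(records: List[Dict[str, Optional[str]]]) -> List[Dict[str, Optional[str]]]:
    """Keep only the records whose (hypothetical) 'orcid' field holds a valid iD."""
    return [r for r in records if is_valid_orcid(r.get("orcid") or "")]


if __name__ == "__main__":
    # Hypothetical records; 0000-0002-1825-0097 is ORCID's documented sample iD.
    sample = [
        {"title": "Dissertation 1", "orcid": "0000-0002-1825-0097"},
        {"title": "Dissertation 2", "orcid": None},
        {"title": "Dissertation 3", "orcid": "0000-0002-1825-0098"},
    ]
    for record in records_with_orcid(sample):
        print(record["title"])
```

Validating the check character before upload catches typos in repository metadata that a pattern match alone would let through.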
Other steps to take in the future:
- Explore other tools:
  - Mix'n'match
  - Author Disambiguator
  - Cradle
  - Others?
- Design and carry out queries for the Wikidata Query Service once we have a large enough sample of items on Wikidata
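As one illustration of the kind of query that could be designed for the Wikidata Query Service, the sketch below builds a SPARQL query listing the doctoral students of a given advisor together with any linked thesis. It assumes the items use P184 (doctoral advisor) and P1026 (academic thesis); the function name is hypothetical, and the actual queries would depend on the properties chosen in the final application profiles.

```python
def advisees_query(advisor_qid: str) -> str:
    """Build a SPARQL query for the students who list this item as doctoral advisor."""
    # Basic sanity check so that only a plain QID is interpolated into the query.
    if not (advisor_qid.startswith("Q") and advisor_qid[1:].isdigit()):
        raise ValueError(f"not a Wikidata QID: {advisor_qid!r}")
    return f"""
SELECT ?student ?studentLabel ?thesis ?thesisLabel WHERE {{
  ?student wdt:P184 wd:{advisor_qid} .           # doctoral advisor (assumed property)
  OPTIONAL {{ ?student wdt:P1026 ?thesis . }}     # academic thesis (assumed property)
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en" . }}
}}
ORDER BY ?studentLabel
""".strip()


if __name__ == "__main__":
    # Q59297625 is the advisor item listed in the OpenRefine examples.
    print(advisees_query("Q59297625"))
```

The printed query can be pasted into the Wikidata Query Service editor; no network request is made by the sketch itself.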
Questions? Contact: jaho@library.tamu.edu