As Make Data Count (MDC) continues to expand the quality and scope of the Data Citation Corpus, the hackathon will focus on two projects that contribute to the long-term expansion and refinement of the corpus. Contributors and MDC advisors will present the work from both projects at the Summit the following day.
Project 1 “Go Go Inspector Corpus!”: Hacking up prototypes for front-end search, reporting, and investigation interfaces for the corpus that will serve as inspiration for future iterations. This UI-focused project requires data scientists, UX and front-end developers, and web database engineers interested in prototyping projects, dashboards, and visualizations based on the corpus that Make Data Count can build on and include in the corpus dashboard.
Project 2 “Who you gonna call, CorpusBusters!”: Create community curation workflows for the corpus. This project aims to build workflows for updating and expanding the corpus's data citations. Work will focus on developing processes for tracking, ingesting, validating, and enriching user-submitted corpus entries, whether submitted individually or in bulk. The project team requires engineers confident in building GitHub Actions and API-based workflows, as well as data pipelines more generally.
Coffee, snacks & lunch will be provided at each location:
Wellcome Trust (215 Euston Road, London) 10a-3p (BST)
California Digital Library (1100 Broadway St, Oakland CA) 10a-3p (PST)
Data Citation Corpus File: https://doi.org/10.5281/zenodo.11196858
**Please download the file ahead of the hackathon
Sample enriched metadata: download here
**Optional download: Jennifer created this sample in prep for her own mini-hackathon in which she created these dashboards for a funder and institution
Please also come prepared with your laptop (and charger). We suggest familiarizing yourself with Colab.