Community-Led Data Repositories in Paleoecology and Paleoclimatology
Facilitating the assembly of individual paleorecords into larger networks, community-led data repositories play a crucial role in the paleogeosciences. By interconnecting geoscientific users and geoinformatics, these repositories enable the exploration of big questions related to global temperatures, CO2 dynamics, species migration, and more. With a focus on long-tail paleoecological data, these repositories support the scientific community in curating and managing valuable datasets for further analysis and research.
Uploaded on Oct 01, 2024 | 0 Views
Presentation Transcript
Community-Supported Data Repositories in Paleoecology and Paleoclimatology: The Middle Tail between Geoscientific Users and Geoinformatics
  Jack Williams, Allan Ashworth, Brian Bills, Jessica Blois, Don Charles, Simon Goring, Russ Graham, Eric Grimm, Alison Smith, & Mark Uhen
  Part I: Building the Middle Tail: Community-Led Data Repositories
  Part II: Interconnecting the Middle Tail: Cyberinfrastructure for the Paleogeosciences
  C4P, Neotoma DB www.neotomadb.org
Many Big Questions require assembly of individual paleorecords into larger networks
  Do global temperatures lead or lag CO2 during deglaciations?
    [Figure: Global temperatures & CO2, 22 ka to 0 ka. Shakun et al. (2012) Nature]
  How far and fast can species migrate when climates change?
    [Figure: Spruce distributions, last glacial maximum to present. Spruce pollen (%) map panels for 21,000, 15,000, 11,000, 7,000, and Modern, with ice and no-data areas marked. Williams et al. (2004) Ecological Monographs]
Paleoecological Data: Key Characteristics
  Long tail: collected in the field by small scientific teams. Scientists vary with respect to data management expertise, capacity, and interest.
  Highly valuable: specimens & samples collected decades ago are still analyzed.
  Distributed scientific expertise: by proxy type, region, time period, and/or taxonomic group.
  [Figure: data size plotted against datasets, with the Big Data and Long Tail regions labeled]
Solution: Community-Led Data Repositories (COLDARs) as middle tail for long-tail data
  Key Characteristics
    Open data
    Curated by community
    Added value by serving community-specific needs (e.g. age models, taxonomy)
  Neotoma DB www.neotomadb.org, Paleobiology DB paleobiodb.org
Moving up the Value Chain: Generic Depositories vs. Community-Led Repositories
  [Figure: value chain rising from small data to big data, with Generic Depositories and Community-Led Repositories marked along it. Modified from K. Lehnert]
    findable: identification, persistence
    accessible: authorization, protocols
    re-usable: context, provenance
    interoperable: harmonized, community governance & input
  "Data have no value or meaning in isolation; they exist within a knowledge infrastructure, an ecology of people, practices, technologies, institutions, material objects, and relationships." C.L. Borgman
Neotoma Paleoecology Database: Community-Led Repository for Quaternary and Pliocene Data
  Design Concepts
    Spatiotemporal database: species occurrences & abundances in space & time
    Age controls and age models stored
    Centralized IT and distributed scientific governance
    Neotoma is composed of several constituent databases (e.g. North American Pollen Database, FAUNMAP)
    Open data accessible via Explorer, APIs, R
  Broad User Community: paleoecologists, ecosystem modellers, paleoclimatologists, biogeographers, educators
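The design concepts above note that age controls and age models are stored with each record. As a rough illustration of what that pairing involves, the Python sketch below turns a handful of dated depths (age controls) into an age estimate for an undated sample depth by linear interpolation, one of the simplest kinds of age model. The class name, field names, and numbers are hypothetical and chosen only for this example; they are not Neotoma's schema, and Neotoma records may rely on other age-model types.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass(frozen=True)
class AgeControl:
    """A dated level in a core (hypothetical structure, not Neotoma's schema)."""
    depth_cm: float
    age_yr_bp: float  # age in years before present


def interpolate_age(controls, depth_cm):
    """Estimate the age at depth_cm by linear interpolation between age controls."""
    pts = sorted(controls, key=lambda c: c.depth_cm)
    depths = [c.depth_cm for c in pts]
    if len(pts) < 2:
        raise ValueError("need at least two age controls")
    if len(set(depths)) != len(depths):
        raise ValueError("age controls must have distinct depths")
    if not depths[0] <= depth_cm <= depths[-1]:
        raise ValueError("depth lies outside the dated interval; no extrapolation")
    # Index of the first control deeper than depth_cm, clamped to the last control.
    i = min(bisect_right(depths, depth_cm), len(pts) - 1)
    lo, hi = pts[i - 1], pts[i]
    frac = (depth_cm - lo.depth_cm) / (hi.depth_cm - lo.depth_cm)
    return lo.age_yr_bp + frac * (hi.age_yr_bp - lo.age_yr_bp)


# Made-up age controls for a single core.
controls = [
    AgeControl(depth_cm=0.0, age_yr_bp=0.0),
    AgeControl(depth_cm=100.0, age_yr_bp=5000.0),
    AgeControl(depth_cm=200.0, age_yr_bp=15000.0),
]
sample_ages = {d: interpolate_age(controls, d) for d in (50.0, 150.0)}
```

Keeping the age controls next to the derived ages is what allows a chronology to be rebuilt later if a better age model becomes available.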
Temporal Domains of Paleoecological Databases
  Neotoma Domain
    Time: Late Neogene (~last 5 million years). Most records: 10^4-10^5 yrs
    Space: North American to global
  Paleoecological Data
    Plants & pollen
    Vertebrates
    Ostracodes
    Diatoms
    Insects
    Testate amoebae
    Physical sedimentology
  Brewer et al. 2012 TREE
Neotoma Uploads, Citations, and Usage
  [Figure: Recent uploads to Neotoma]
  [Figure: Pubs citing Neotoma & constituent DBs]
  2014 Usage Statistics
    Neotoma Explorer: 1,918 unique users
    Neotoma APIs: 1,562 unique users
    Neotoma APIs: 241,469 requests
  Last updated: July 2015
Neotoma Software Ecosystem
  [Diagram; each component is marked as either Exists or In Development]
  Data Preparation & Submission: Tilia, Data Submission Web Application
  Data Archival: Neotoma DB
  Data Search & Retrieval: Neotoma Explorer, APIs, neotoma (R), Downloadable Database Snapshots
  Data Exploration & Visualization: Niche Viewer, Ice Age Mapper, Stratigraphic Diagrams, Explorer
Neotoma Governance (Proposed)
  Executive Team: Grimm, Williams + 1 more
  Neotoma Leadership Council: Grimm, Williams, Bills + 1 Developer & 3 Data Stewards
  Developer Team: Bills (lead), Anderson, Buckland, Davis, Goring, Grimm, Roth, Williams
  Data Stewards
    Amoebae: Bob Booth
    Diatoms: Don Charles, Sonja Hausmann
    Insects: Ashworth, Buckland, Punel
    Middens: Betancourt, Holmgren, Latorre, Rylander
    Ostracodes: Alison Smith, Brandon Curry
    Pollen: Grimm, Bradshaw, Giesecke, Williams, Goring, Evans, Fletcher, Hopf, Markgraf, McGeever, Mitchell
    Plant Macros: Bob Booth
    Vertebrates: Graham, Blois, Davis, Barnosky, Colburn, Etnier, Jacisin, Maguire, Milideo, Smith, Warren
    Biomarkers: Jon Nichols
    Isotopes: Suzanne Pilaar Birch, Chris Widja
    Taphonomy: Josh Miller, Russ Graham
  Also shown: Users & Informaticists, Training Workshops, Paleobiological Data Consortium
Next Challenge: Organizing and Interconnecting the Middle Tail
  C4P CINERGI Catalog: 224 databases, 23 with geologic time metadata
  http://pivots2.azurewebsites.net/c4p.html#pv-file-selection
EarthCube RCN: Cyberinfrastructure for Paleobioscience (C4P)
  Goals
    Build new partnerships and collaborations among geoscientists and technologists
    Survey and catalog existing resources
    Share news of the latest advances in cyberscience and paleogeoinformatics
    Facilitate development of common standards and semantic frameworks
EarthCube RCN: Cyberinfrastructure for Paleobioscience (C4P)
  Activities
    Webinars & YouTube Channel: https://www.youtube.com/user/cyber4paleo
    CINERGI Catalog of paleoresources (databases, software, etc.): http://earthcube.org/content/cinergi-c4p-resource-viewer
    Paleobiology Workshop (May 2014)
    Geochronology Workshop (Oct 2014)
    Early Career Workshops: GSA 2014, 2015
    New Initiatives: Paleobiological Data Consortium (Neotoma/PBDB/...), PBDB-iDigBio, Open Core Data (CDSCO/IEDA/Neotoma/...)
PALEOBIOLOGICAL DATA CONSORTIUM
  Share best practices & protocols
  Build compatibility between geo- & bioinformatics
  [Diagram: participants arranged under four labels: Biodata, Geodata, Community, Open-Source]
  Participants shown: DarwinCore, VertNet, GBIF/BISON, iDigBio, iDigPaleo, Digimorph, MorphoBank, Paleobiology DB, NOAA Paleoclimatology, Neotoma DB, STEPPE, NOW DB, Continental Scientific Drilling Office (CDSCO), Integrated Earth Data Alliance, Early Career Members-at-Large, ROpenSci, Open Geospatial Consortium
Current & Future Neotoma, C4P, & PDC Activities
  1. Data Uploads (Neotoma; e.g. MIOMAP, Mexican Quaternary Mammal DB, ongoing)
  2. All Hands Neotoma Workshop at AGU (Neotoma; Dec 2015)
  3. One-Stop Queries for Neotoma & Paleobio DBs (Harmonized APIs & R packages) (PDC, ongoing)
  4. Hackathon for Paleobiological Data (C4P; Summer 2016, invitations TBD!)
  5. New tools for data visualization & exploration (Neotoma Taxa Mapper & Niche Viewer)
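The "one-stop query" idea in the list above depends on mapping each database's records onto a shared shape before searching them together. The Python sketch below illustrates that harmonization step under stated assumptions: the two input record layouts, their field names, and the age units (years BP for one source, Ma for the other) are hypothetical stand-ins, not the actual Neotoma or Paleobiology Database API schemas.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Occurrence:
    """Common record shape shared by both sources (hypothetical)."""
    source: str
    taxon: str
    lat: float
    lon: float
    age_ka: float  # thousands of years before present


def from_source_a(rec):
    # Assumption: source A reports ages in years BP under "age_yr_bp".
    return Occurrence("A", rec["taxon"], rec["lat"], rec["lon"],
                      rec["age_yr_bp"] / 1000)


def from_source_b(rec):
    # Assumption: source B reports ages in millions of years under "age_ma".
    return Occurrence("B", rec["name"], rec["latitude"], rec["longitude"],
                      rec["age_ma"] * 1000)


def one_stop_query(taxon, a_records, b_records, max_age_ka):
    """Harmonize both sources, then filter by taxon and age in a single pass."""
    merged = [from_source_a(r) for r in a_records]
    merged += [from_source_b(r) for r in b_records]
    hits = [o for o in merged
            if o.taxon.lower() == taxon.lower() and o.age_ka <= max_age_ka]
    return sorted(hits, key=lambda o: o.age_ka)


# Made-up records standing in for responses from the two databases.
a_records = [
    {"taxon": "Picea", "lat": 45.0, "lon": -90.0, "age_yr_bp": 12000},
    {"taxon": "Quercus", "lat": 40.0, "lon": -85.0, "age_yr_bp": 8000},
]
b_records = [
    {"name": "Picea", "latitude": 60.0, "longitude": -120.0, "age_ma": 0.5},
    {"name": "Picea", "latitude": 62.0, "longitude": -118.0, "age_ma": 3.0},
]
results = one_stop_query("picea", a_records, b_records, max_age_ka=1000)
```

Converting units and field names once, at the boundary, keeps the query logic itself independent of which database a record came from.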
Sounds great! What's in it for me?
  1. Interested in using Neotoma to archive your data and make it available to others?
    Catch me after the session
    Talk to a Data Steward
    WebEx training for new Stewards
  2. Interested in using Neotoma & other paleobio resources?
    Neotoma Explorer walkthrough exercise: http://serc.carleton.edu/neotoma/activities.html
    neotoma (R) paper (Goring et al. 2015 Open Quaternary)
    User workshops: ESA2016, IBS2017
    Hackathon Summer 2016
  3. Interested in integrating your resource (software/DBs) with Neotoma & other paleobio resources?
    Catch me after the session
    Hackathon Summer 2016
This talk represents the work of many
  Neotoma DB (NSF-Geoinformatics)
    Neotoma PIs & Developers: Eric C. Grimm, Russ Graham, Mike Anderson, Allan Ashworth, Brian Bills, Jessica Blois, Bob Booth, Ed Davis, Don Charles, Simon Goring, Steve Jackson, Alison Smith, Jack Williams
  C4P (NSF-EarthCube)
    C4P RCN Steering Committee: Kerstin Lehnert, David Anderson, Doug Fils, Leslie Hsu, Chris Jenkins, Anders Noren, Tom Olsewski, Dena Smith, Mark Uhen, Jack Williams
  Paleobio Data Consortium (NSF-EarthCube)
    Paleobiological Data Consortium: Mark Uhen, Jack Williams, Brian Bills, Jessica Blois, Ed Davis, Simon Goring, Russ Graham, Michael McClennen, Shanan Peters, Alison Smith