Insights into Long-Term Archiving Challenges for Large Corpora
The presentation Signposts for CLARIN looks at the difficulties of preserving and managing large corpora over the long term. It covers data that must be altered or deleted, conversion to new formats, and virtual collections. The further reading entry points to a workshop paper on addressing these challenges in long-term archiving.
Presentation Transcript
Signposts for CLARIN
Denis Arnold, Bernhard Fisseni, and Thorsten Trippel
[Diagram: Resource 5 KM Data File * 2018 2019]
CLARIN Open Science FAIR CMDI Virtual Language Observatory (VLO) Virtual Collection Registry (VCR)
CMDI Collection A Collection B
Virtual Collection Collection A Collection B Virtual Collection
Data changes
Legal: Data has to be altered or even deleted
Preservation: Data has to be converted to new formats
Virtual Collection Collection A Collection B Virtual Collection
Signpost Collection A Collection A
Virtual Collection Collection A Collection B Virtual Collection
Resource 5 KM Data File * 2018 2019
Further reading
Arnold, Denis, Bernhard Fisseni, Paweł Kamocki, Oliver Schonefeld, Marc Kupietz, and Thomas Schmidt (2020). Addressing Cha(lle)nges in Long-Term Archiving of Large Corpora. In: Proceedings of the LREC 2020 Workshop Challenges in the Management of Large Corpora (CMLC-8). Marseille, France.
Thanks! Oliver Schonefeld, Paweł Kamocki, Marc Kupietz, Thomas Schmidt