Insights into Long-Term Archiving Challenges for Large Corpora

 
Signposts for CLARIN
 
Denis Arnold, Bernhard Fisseni, and Thorsten Trippel
Resource 5 KM
Data File
* 2018
† 2019
 
CLARIN
 
Open Science
FAIR
CMDI
Virtual Language Observatory (VLO)
Virtual Collection Registry (VCR)
 
CMDI
 
Collection A
 
Collection B
 
Virtual Collection
 
Collection A
 
Collection B
 
Virtual Collection
 
Data changes
 
Preservation
 
Data has to be converted to new formats
 
Legal
 
Data has to be altered or even deleted
 
Virtual Collection
 
Collection A
 
Collection B
 
Virtual Collection
 
Signpost
 
Collection A
 
Collection A
 
Virtual Collection
 
Collection A
 
Collection B
 
Virtual Collection
Resource 5 KM
Data File
* 2018
† 2019
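
The signpost slide can be read as a small record that stays behind when a resource is altered or withdrawn. The following is only a minimal sketch of that idea, not CLARIN's implementation: the `Signpost` class, its field names, the `reason` values, and the `resolve` helper are all hypothetical, and only the example values ("Data File", 2018, 2019) come from the slide.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Signpost:
    """Hypothetical record describing a resource and what happened to it."""
    name: str
    created: int                      # year marked with * on the slide
    withdrawn: Optional[int] = None   # year marked with † on the slide; None if still available
    reason: Optional[str] = None      # assumed vocabulary, e.g. "legal" or "preservation"
    successor: Optional[str] = None   # assumed: identifier of a converted version, if any

    def is_available(self) -> bool:
        return self.withdrawn is None

    def label(self) -> str:
        # Render the record in the style used on the slide.
        text = f"{self.name}, * {self.created}"
        if self.withdrawn is not None:
            text += f", † {self.withdrawn}"
        return text


def resolve(members: List[Signpost]) -> Tuple[List[Signpost], List[Signpost]]:
    """Split the members of a virtual collection into available and withdrawn ones."""
    available = [m for m in members if m.is_available()]
    withdrawn = [m for m in members if not m.is_available()]
    return available, withdrawn


data_file = Signpost(name="Data File", created=2018, withdrawn=2019, reason="legal")
collection_b = Signpost(name="Collection B", created=2018)
available, withdrawn = resolve([data_file, collection_b])
print(data_file.label())
```

In this sketch a virtual collection keeps referring to the withdrawn member, so a user following the reference reaches the signpost rather than a dead link.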
 
 
Further reading
 
Arnold, Denis, Bernhard Fisseni, Paweł Kamocki, Oliver Schonefeld, Marc Kupietz, and Thomas Schmidt (2020). ‘Addressing Cha(lle)nges in Long-Term Archiving of Large Corpora’. In: Proceedings of the LREC 2020 Workshop ‘Challenges in the Management of Large Corpora’ (CMLC-8). Marseille, France.
 
Thanks!
 
Oliver Schonefeld
Paweł Kamocki
Marc Kupietz
Thomas Schmidt

This presentation introduces signposts for CLARIN. It looks at how virtual collections built on CMDI metadata are affected when the underlying data changes, either because data has to be converted to new formats for preservation or because it has to be altered or even deleted for legal reasons. The further reading entry points to a workshop paper on changes and challenges in the long-term archiving of large corpora.

  • Long-Term Archiving
  • Large Corpora Data
  • Data Preservation
  • Virtual Collection
  • Research Resources

Uploaded on Sep 12, 2024 | 1 Views



Presentation Transcript


  1. Signposts for CLARIN. Denis Arnold, Bernhard Fisseni, and Thorsten Trippel. Resource 5 KM. Data File, * 2018, † 2019

  2. CLARIN Open Science FAIR CMDI Virtual Language Observatory (VLO) Virtual Collection Registry (VCR)

  3. CMDI Collection A Collection B

  4. Virtual Collection Collection A Collection B Virtual Collection

  5. Data changes. Preservation: data has to be converted to new formats. Legal: data has to be altered or even deleted.

  6. Virtual Collection Collection A Collection B Virtual Collection

  7. Signpost Collection A Collection A

  8. Virtual Collection Collection A Collection B Virtual Collection

  9. Resource 5 KM. Data File, * 2018, † 2019

  10. Further reading: Arnold, Denis, Bernhard Fisseni, Paweł Kamocki, Oliver Schonefeld, Marc Kupietz, and Thomas Schmidt (2020). ‘Addressing Cha(lle)nges in Long-Term Archiving of Large Corpora’. In: Proceedings of the LREC 2020 Workshop ‘Challenges in the Management of Large Corpora’ (CMLC-8). Marseille, France.

  11. Thanks! Oliver Schonefeld, Paweł Kamocki, Marc Kupietz, Thomas Schmidt
