Tools for Language Documentation - Overview and Principles


Explore the tools and methodologies for documenting languages, including physical and conceptual tools, procedural techniques, and dissemination strategies. Understand the goals of language documentation and the principles involved in preserving linguistic data. Delve into the process of language documentation as a subfield of Documentary Linguistics, covering data collection, preservation, and analysis for theoretical and empirical linguistics.


Uploaded on Oct 09, 2024 | 0 Views



Presentation Transcript


  1. Week 1: Overview. Tools for Language Documentation. Claire Bowern, Yale University. LSA Summer Institute, 2013.

  2. OVERVIEW, GOALS OF CLASS

  3. Tools for documentation. Physical tools: hardware, software, stimuli. Conceptual tools: what makes a good documentary corpus. Procedural tools: how to go about documenting a language. Tools for disseminating results.

  4. Overview. Week 1: overview, hardware, software. Week 2: elicitation techniques, grammar writing. Week 3: narratives, conversation, corpus building. Week 4: lexicon, archiving.

  5. About the class. How to describe/document a language. *No practical component* (in that we won't be working with speakers). However, there will be time (I hope!) to talk about your own field data, and we will be doing some exercises with existing data. I will provide datasets for exercises (if you don't have data of your own to use). You can also use data from the field methods class here at the Institute.

  6. A few assumptions for this class. Not talking about community-oriented materials here (I see documentary materials as feeding into that, though). Assuming that the language doesn't have a lot of other materials apart from what the linguist will be producing. Assuming that the linguist will be the one doing most of the writing. Implicitly assuming a grammar/dictionary/texts model (more on this below). None of these assumptions are crucial; they're just there so we can limit the topic a bit.

  7. PRINCIPLES OF DOCUMENTATION

  8. What is language documentation? Documentary Linguistics as its own subfield. Doing things with linguistic data: getting the data, preserving it, processing it, (analyzing it). Cf. Woodbury (2002): "Language documentation is the creation, annotation, preservation, and dissemination of transparent records of a language." Important for both theoretical and empirical branches of linguistics: typology, historical linguistics, etc.

  9. What shapes the language record? The linguist (i.e. you!): their interests, their abilities. The speakers and their interests! External circumstances: funding, time available, lucky breaks, unlucky breaks.

  10. Language Documentation as a Language Legacy. Particularly relevant for endangered languages. Your work might be the only substantive record of a language: there are few speakers; the field might view the language as "done"; speakers might view the language as "done".

  11. Planned Documentation vs "Collect it all". "Making a record of the language": comprehensive grammar. You can't collect everything. All documentation is sampling. Unstructured, unanalyzed corpora usually aren't very useful: they are hard to use; they don't get worked on; they usually aren't big enough to test hypotheses computationally; they require native speakers (or people who are already very familiar with the language). That is fine for languages with a major presence, but what about the quarter of the world's languages with fewer than 10,000 speakers?

  12. What counts as documentation? When is a collection big enough to count as language documentation? Is an article in Linguistic Inquiry language documentation? It involves creation, annotation, preservation, and dissemination, but of only a very small fragment of a language.

  13. How much time/space does a documentary corpus take? Depends on the resources: time, speakers, money, levels of interest.

  14. Grammar, Dictionary, Texts. The "Boasian Trilogy": structure, lexicon, culture. A way to present the analysis and also allow others to recreate it (or challenge it) from the underlying data. Conceived broadly: capture language structure; capture language in use; capture lexicon and meaning.

  15. Sampling: Documentation as snapshots. A big part of documentation is constructing a good set of "samples". To do that, you will need to consider what the purpose of the documentary record is. That is, why are you collecting data on the language? To make a lasting record of the language; to reclaim the language for future speakers; to write a reference grammar; to document the culture in the traditional language; to investigate a particular aspect of the language; all of the above.

  16. Sampling. Are your snapshots representative? Speakers; subjects/topics; grammatical constructions; lexicon.
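
One rough way to check whether the snapshots are representative is to tally what the sessions collected so far actually cover. The sketch below assumes each session is logged as a small dictionary; the field names (`speakers`, `genre`), the speaker codes, and the list of expected genres are all hypothetical placeholders, not part of the original slides.

```python
from collections import Counter

def coverage(sessions, key):
    """Count how often each value of `key` occurs across session records."""
    counts = Counter()
    for session in sessions:
        value = session[key]
        # A field may hold a single value or a list of values.
        values = value if isinstance(value, list) else [value]
        counts.update(values)
    return counts

def underrepresented(counts, expected, minimum=1):
    """Return the expected values that occur fewer than `minimum` times."""
    return sorted(v for v in expected if counts.get(v, 0) < minimum)

# Hypothetical session log, for illustration only.
sessions = [
    {"speakers": ["SPK1"], "genre": "elicitation"},
    {"speakers": ["SPK1", "SPK2"], "genre": "narrative"},
    {"speakers": ["SPK1"], "genre": "elicitation"},
]

speaker_counts = coverage(sessions, "speakers")
missing_genres = underrepresented(
    coverage(sessions, "genre"),
    expected=["elicitation", "narrative", "conversation"],
)
```

With this log, one speaker dominates and no conversation has been recorded yet, which is the kind of gap worth noticing before the field season ends.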

  17. Planned versus opportunistic collection. Planned: translated sentences, grammaticality judgments, etc. Unplanned (or planning gone wrong): speakers reinterpret your prompts and construct narratives from them; a new speaker comes to a session and wants to tell stories; you find a new (to you) morpheme in your data and want to find out how it works; you overhear a new construction in conversation.

  18. What constitutes a documentary corpus? ***Everything***: sound files, videos, transcripts (elicitation prompts are part of the annotation), photographs, maps, (artifacts), metadata (data about the data), metametadata.
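
To make "metadata (data about the data)" concrete, here is a minimal sketch of a per-session metadata record saved as JSON alongside the recordings. The field names, the session ID scheme, and the example values are assumptions made for illustration; they do not follow any particular archive's schema.

```python
import json
from dataclasses import dataclass, field, asdict

@dataclass
class SessionRecord:
    """Hypothetical metadata for one recording session."""
    session_id: str
    date: str                      # ISO 8601, e.g. "2013-07-01"
    speakers: list
    genre: str                     # e.g. "elicitation", "narrative"
    files: list = field(default_factory=list)
    notes: str = ""

def to_json(record):
    """Serialize a record; ensure_ascii=False keeps non-ASCII text readable."""
    return json.dumps(asdict(record), indent=2, ensure_ascii=False)

# Illustrative values only.
example = SessionRecord(
    session_id="S001",
    date="2013-07-01",
    speakers=["SPK1"],
    genre="elicitation",
    files=["S001.wav", "S001_transcript.txt"],
    notes="Numerals; elicitation prompts kept with the transcript.",
)
example_json = to_json(example)
```

Keeping the list of files inside the record ties sound files, transcripts, and prompts together, so the corpus can still be navigated when the folder structure changes.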

  19. WORKFLOW AND DATA TYPES

  20. Workflow: 1. What do you need to do to document a language? 2. What order do you need to do it in? 3. (How will you know if it's been done right?)

  21. Scaled workflow. Project as a whole (timescale of years), e.g. Bardi language documentation. Immediate tasks (timescale of weeks or months), e.g. Bardi learner's guide. Subtasks (timescale of days or weeks), e.g. write the section on numbers. Data gathering (timescale of single session), e.g. get data on numerals in use.

  22. Workflow while on fieldwork

  23. HARDWARE

  24. Sample field kit. Equipment: laptop; audio recorder; video recorder; microphones; backup means of recording (e.g. from laptop, second recorder). Media: backup devices [hard drive, DVDs, etc]; memory cards for recorders; paper! pens! Other: ways of keeping the equipment clean; carry bag; stills camera (cell phone, ipad, etc); batteries, other power equipment; tripod. Stimuli/research prompts.
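
Backup devices only help if the copies are intact. As a hedged sketch (one possible routine, not a procedure from the slides), the code below copies a recording to a backup folder and compares SHA-256 checksums of the original and the copy. The function names and folder layout are hypothetical.

```python
import hashlib
import shutil
from pathlib import Path

def sha256_of(path, chunk_size=1 << 20):
    """Compute a file's SHA-256 digest, reading in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def backup_and_verify(source, backup_dir):
    """Copy `source` into `backup_dir` and confirm the copy is byte-identical."""
    source = Path(source)
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    target = backup_dir / source.name
    shutil.copy2(source, target)  # copy2 also preserves timestamps
    if sha256_of(source) != sha256_of(target):
        raise IOError(f"Backup of {source} does not match the original")
    return target
```

A call such as `backup_and_verify("S001.wav", "/media/backup/2013-07-01")` (hypothetical paths) would be run after each session, before the memory card is reused.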

  25. Audio. The field has converged on solid state recorders using SD cards: Zoom Handy H2 or H4 (or H6 coming soon!); Edirol R-09; Marantz PMD 660 or 670. And/or laptops (or laptop plus external sound card/preprocessor). Features: small/portable; AA batteries; high quality, lossless formats; easy to use; easy to transfer data.
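
A quick check that a recorder really is producing uncompressed WAV at the settings you intended can be sketched with Python's standard `wave` module. The thresholds used here (44.1 kHz, 16-bit) are an assumption for illustration, not a recommendation from the slides; substitute whatever your archive requires.

```python
import wave

def describe_wav(path):
    """Read basic properties of an uncompressed PCM WAV file."""
    with wave.open(str(path), "rb") as w:
        return {
            "channels": w.getnchannels(),
            "sample_rate_hz": w.getframerate(),
            "bit_depth": w.getsampwidth() * 8,  # sample width is in bytes
            "duration_s": w.getnframes() / w.getframerate(),
        }

def meets_minimum(info, min_rate=44100, min_bits=16):
    """Check a description against assumed minimum recording settings."""
    return info["sample_rate_hz"] >= min_rate and info["bit_depth"] >= min_bits
```

Running `describe_wav` on a test recording before the first session is a cheap way to catch a recorder that was left on a compressed or low-resolution setting.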

  26. Not recommended: dictaphones, cassette recorders, DAT.

  27. Video. Less consensus on models. Major component of the documentation or side-project? Options: smart phone; ipad; stills camera with video function; dedicated video camera. Features: SD card, mic jack. Problems: mpeg vs other proprietary video formats; large files; memory-intensive.

  28. Microphones. Headset vs lapel vs meeting microphone; dynamic vs cardioid; wired vs wireless; XLR vs 1/8" jack. The built-in mics in the Edirol, Handy, etc, are also ok. You get what you pay for, approximately. Remember that microphone placement and volume monitoring are much more important than the quality of the microphone (far more recordings are ruined through the former than the latter).

  29. Computer. Laptop: lots of memory; lots of hard drive space. Usually don't need ruggedization features. Get the cheapest possible and assume it won't last for more than a season, or try for a higher end model. Special considerations for high altitude, high humidity, or low temperature work. High altitude: hard drives fail, so use solid state. High humidity: condensation issues. Low temperatures: battery issues. (See Lanz 2010)

  30. Tablets? Most language software won't run on ipads or other tablets. Great for stimuli, backup recorder, camera, etc. Too much data.

