Tracing Locative Exponents in Historical Corpora


This presentation traces the evolution of locative exponents in historical corpora of Czech. Linguistic change over time typically follows an S-curve pattern; the study examines exceptions to this pattern, focusing on Czech nominal paradigms in transition within the declensional system. It highlights the dynamic nature of language evolution and the interplay of variant forms in historical contexts.

  • Linguistic change
  • Historical corpora
  • Locative exponents
  • Czech language
  • Language evolution


Presentation Transcript


  1. The fate of variant forms in historical corpora: Tracing locative exponents in Diakon. Neil Bermel & Luděk Knittl, University of Sheffield. SlaviCorp 2018, 25 Sept.

  2. Section 1: The natural course of historical change

  3. Linguistic change through time. "the time course of the propagation of a language change typically follows an S-curve" (Croft 2000: 183). "What actually happens much of the time is more like slow, slow, quick, quick, slow. After the phase when the new form gains ascendancy rather rapidly, the process of change slows down again as the last remnants of the older state linger on. The result might look something like this: [Figure: S-curve] The whole thing can last hundreds of years altogether, indeed may never be wholly completed, but the bulk of the change is located within a much narrower slice of time where the slope is steeper." (Denison 2003)
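
The "slow, slow, quick, quick, slow" shape quoted above is commonly modelled with a logistic function. The following is a minimal sketch under that assumption; the function name `logistic_share`, the midpoint year and the rate are hypothetical values chosen only for illustration and do not come from the presentation.

```python
import math


def logistic_share(year, midpoint, rate):
    """Share (0..1) of the incoming variant in a given year under a logistic model.

    midpoint: year at which old and new variants are equally frequent (assumed).
    rate: steepness of the curve (assumed).
    """
    return 1.0 / (1.0 + math.exp(-rate * (year - midpoint)))


if __name__ == "__main__":
    # Hypothetical change centred on 1850; the numbers are illustrative only.
    for year in range(1700, 2001, 50):
        print(year, round(logistic_share(year, midpoint=1850, rate=0.04), 3))
```

With these illustrative parameters, most of the rise falls within the century around the midpoint, while the tails on either side change very slowly.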

  4. Exceptions to the S-curve principle. "To our knowledge there are no clearly documented cases of a change going toward completion that follows either a simple linear trajectory or an exponential curve (either slow start with a rapid completion and no tapering off, or an immediate rapid increase followed by a slow completion rate). There are, however, examples of variation that does not seem to be going toward completion, at least in the documented time period. These examples appear to exemplify either reasonably stable variation with the variants fluctuating around a mean percentage value, or a rise and fall of a relatively low frequency variant, commonly in competition with an incoming variant that is going to (near) completion on an S-shaped trajectory." (Blythe & Croft 2012: 280)

  5. Section 2: Czech nominal paradigms in transition

  6. The Czech declensional system. Original state: Proto-Slavonic declension patterns based on the thematic vowel of the stem or the final consonant of the stem. Modern Czech: based on the grammatical gender of the lexeme plus the hardness/softness of the final stem consonant. Along the way: the collapse of old distinctions results in variation within paradigms, as loc. {u} replaces {e/ě} in the hard masc. inanimate class.

  7. An approximation of our question: {u}, {e/ě} after loc. sg. adjectives: -ém/-om + {u} ~ -ém/-om + {e/ě}. The {u} form is known to be ascendant in this case (loc. sg.); {e/ě} is receding from nouns of this class.

  8. Section 3: Data, methods, hypothesis

  9. Diakon corpus of diachronic Czech: 145m tokens. Strongly skewed towards the 19th-21st c. Graphic output in SyD (no concordances or further sorting) or KWIC in KonText (no graphic output after manual sort).

  10. Data = all lemmas in SYN2005* with: (a) some degree of loc. sg. variation (2+ exponents attested); otherwise, a search on [lemma=".*[bdfghklmnprstvxz]" & tag="NNIS6.*"] yields 930,035 tokens in 17,221 lemmas; (b) 1000+ loc. sg. tokens; otherwise, results in earlier periods are uninformative. *SYN2005: a representative 100m-word-form corpus of contemporary Czech.
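
The two selection criteria on this slide (2+ attested exponents, 1000+ loc. sg. tokens) can be sketched as a filter over per-lemma counts. This is only an illustration, not the authors' procedure: the function `select_lemmas`, the input layout and every count in `sample` are invented, and the lemma pattern is copied from the slide's query on the assumption that it is meant to match the whole lemma.

```python
import re

# Lemma pattern taken from the slide's corpus query; applied here to the whole lemma.
HARD_FINAL = re.compile(r".*[bdfghklmnprstvxz]")


def select_lemmas(counts, min_tokens=1000, min_exponents=2):
    """Return lemmas meeting both criteria.

    counts: {lemma: {exponent: number of loc. sg. tokens}} (hypothetical layout).
    """
    selected = []
    for lemma, by_exponent in counts.items():
        if not HARD_FINAL.fullmatch(lemma):
            continue  # stem does not end in one of the listed consonants
        attested = [e for e, n in by_exponent.items() if n > 0]
        total = sum(by_exponent.values())
        if len(attested) >= min_exponents and total >= min_tokens:
            selected.append(lemma)
    return selected


# Invented counts, for illustration only.
sample = {
    "hrad": {"u": 300, "ě/e": 900},     # variation and enough tokens
    "telefon": {"u": 2500, "ě/e": 0},   # frequent, but only one exponent
    "kmen": {"u": 40, "ě/e": 60},       # variation, but too few tokens
    "moře": {"i": 5000},                # does not match the lemma pattern
}

if __name__ == "__main__":
    print(select_lemmas(sample))
```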

  11. Graphic outputs in SyD: mid-frequency = uninformative results by historical period. [Figure: SyD graphic output]

  12. Lemmas searched in Diakon: hrad pattern, Lsg (in order of decreasing frequency in SYN2005): 1. případ; 2. svět; 3. základ; 4. život; 5. dům; 6. byt; 7. stůl; 8. zápas; 9. les; 10. stát; 11. ostrov; 12. proces; 13. západ; 14. východ; 15. provoz; 16. dvůr; 17. tábor; 18. jazyk; 19. sál; 20. kostel; 21. obchod; 22. areál; 23. způsob; 24. den; 25. dopis; 26. závod; 27. úřad; 28. led; 29. okres; 30. hrad; 31. hlas; 32. most; 33. parlament; 34. vůz; 35. pád; 36. zákon; 37. venkov; 38. úvod; 39. oběd; 40. obraz; 41. časopis; 42. obvod; 43. bod; 44. papír; 45. kout; 46. ústav; 47. festival; 48. koncert; 49. přechod; 50. klín; 51. pořad; 52. hřbitov.

  13. Validating data from Diakon. Morphological homonymy: {u}: loc. sg. but also dat. sg., gen. sg., voc. sg. (hlasu 'voice'); {ě/e}: loc. sg. but also voc. sg. (hlase 'voice'). Lexical homonymy: sále 'hall' (sál.LOC.SG) or Sále 'Saale river'; zápase 'match' (zápas.LOC.SG) or 'striving' (zápasit.PRES.TRANS). Sort using left and right contexts. Assess each concordance line for inclusion/exclusion.

  14. Method (1). Validated data is grouped into 50-year bins; the proportion of each ending is recorded in a timeline graph. 38,826 contexts from Diakon (1300-1950); 191,259 contexts from SYN2005 (mostly 1950-2000). Problem: data still scarce; no evidence of S-curves anywhere. [Figure: papíře/papíru, percentage of each ending per 50-year bin]
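
The binning step described on this slide can be sketched as follows. This is a minimal illustration, not the authors' tooling: the function names, the `(year, ending)` input layout and the bin origin of 1300 are assumptions, and the sample contexts are invented. Bins here run 1800-1849, 1850-1899 and so on, which may differ by a year from the boundaries used in the slides.

```python
from collections import Counter, defaultdict


def bin_start(year, width=50, origin=1300):
    """First year of the bin containing `year` (bins of `width` years from `origin`)."""
    return origin + ((year - origin) // width) * width


def ending_shares(contexts, width=50, origin=1300):
    """Map each bin to the proportion of each ending attested in it.

    contexts: iterable of (year, ending) pairs, one per validated concordance line.
    """
    tallies = defaultdict(Counter)
    for year, ending in contexts:
        tallies[bin_start(year, width, origin)][ending] += 1
    shares = {}
    for start in sorted(tallies):
        total = sum(tallies[start].values())
        shares[start] = {e: n / total for e, n in tallies[start].items()}
    return shares


# Invented contexts, for illustration only.
sample_contexts = [
    (1812, "u"), (1830, "ě/e"), (1849, "u"),
    (1851, "u"), (1875, "ě/e"), (1899, "ě/e"),
]

if __name__ == "__main__":
    for start, by_ending in ending_shares(sample_contexts).items():
        print(start, by_ending)
```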

  15. Method (2). 1300-1800 collapsed into a single bin (sparse data). Assess direction of travel: 1. preserve original {ě/e}; 2. preserve original {u}; 3. maintain variation {ě/e}~{u}; 4. shift {u} > {ě/e}; 5. shift {ě/e} > {u}. [Figure: papíře/papíru, percentage of each ending for 1300-1800, 1801-1850, 1851-1900, 1901-1950, SYN2005]
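
The slide does not state how a lexeme's direction of travel was decided, so the following is only a hypothetical sketch of how the five outcomes could be assigned from a sequence of {u} shares, one per period. The function name and both thresholds (`categorical`, `min_shift`) are assumptions introduced for illustration, not values from the presentation.

```python
def direction_of_travel(u_shares, categorical=0.9, min_shift=0.3):
    """Assign one of the five outcomes to a lexeme.

    u_shares: share of {u} (0..1) in each period, oldest first.
    categorical: assumed share at which one ending counts as (near-)exclusive.
    min_shift: assumed minimum change between first and last period for a shift.
    """
    change = u_shares[-1] - u_shares[0]
    if change >= min_shift:
        return "shift {ě/e} > {u}"
    if change <= -min_shift:
        return "shift {u} > {ě/e}"
    if all(s >= categorical for s in u_shares):
        return "preserve original {u}"
    if all(1 - s >= categorical for s in u_shares):
        return "preserve original {ě/e}"
    return "maintain variation {ě/e}~{u}"


if __name__ == "__main__":
    # Invented share sequences, for illustration only.
    print(direction_of_travel([0.8, 0.7, 0.5, 0.3, 0.2]))
    print(direction_of_travel([0.5, 0.6, 0.4, 0.55, 0.5]))
```

Comparing only the first and last period is a deliberate simplification; a lexeme that rises and then falls back would be classed as stable variation under this sketch.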

  16. Hypothesis: High-frequency items showing some variation between locative endings will shift toward the expansive {u} ending over time, with much change concentrated in one period. (Null hypotheses: There is no trend towards the expansive {u} ending, or there is a trend towards the recessive {e/ě} ending, or at any rate the trend does not look like an S-curve.)

  17. Section 4: Results

  18. Results (overall): 1. preserve original {ě/e}: 16 lexemes; 2. preserve original {u}: 12 lexemes; 3. maintain variation {ě/e}~{u}: 12 lexemes; 4. shift {u} > {ě/e}: 12 lexemes; 5. shift {ě/e} > {u}: 0 lexemes (!)

  19. Summary of results and directions: Null hypothesis supported (strong form). Confounding factors? Possible reasons.

  20. Section 5: Confounding factors

  21. Non-representative data. Target words are non-representative in two intentional ways: they display at least minimal loc. sg. variation in SYN2005, and they have high frequency in SYN2005. Are they non-representative in any unintentional way? Phonological features favoring {ě/e}? Word-formational features favoring {ě/e}? Etymological features favoring {ě/e}? Semantic factors favoring {ě/e}? Syntactic factors favoring {ě/e}?

  22. 1. Phonological factors. Štícha 2009 (inter alia): stems ending in /h, g, f, k, ch, r, p, b, m/ take {u}, the first three without exception; stems ending in /d, t, n, s, z, l/ may have either {ě/e} or {u}, with {ě/e} being most likely to appear for the final three.

  23. 1. Phonological factors (continued). Table: stem-final consonant; typical exponent (Štícha 2009); lexemes in our data, sorted in the original into four outcome columns (maintain {ě/e}; shift {u} > {ě/e}; maintain {u}~{ě/e}; maintain {u}). -l: sometimes {ě}; kostel, stůl, areál, festival, sál. -z: sometimes {ě}; vůz, provoz, obraz. -s: sometimes {ě}; les, proces, časopis, dopis, hlas, okres, zápas. -d: occasionally {ě}; led, východ, základ, západ, hrad, oběd, případ, bod, obchod, úřad, závod, obvod, pád, pořad, přechod, úvod. -t: occasionally {ě}; stát, koncert, parlament, byt, kout, most, svět, život. -n: occasionally {ě}; klín, den, zákon. -v: predominantly {u}; ústav, hřbitov, ostrov, venkov. -m: predominantly {u}; dům. -b: predominantly {u}; způsob. -r: predominantly {u}; papír, dvůr, tábor. -k: predominantly {u}; jazyk.

  24. 2. Word-formation and etymology. Treat together: word-formational tendencies said to affect native words (Grepl et al. 253, Cvrček et al. 165); etymological reasons: distinguish treatment of borrowed vs. native words (Petr et al. II:305, Cvrček et al. 165).

  25. 2. Word-formation/etymology (cont.). Table: feature (number of lexemes); expected outcome; lexemes in our data, sorted in the original into four outcome columns (maintain {ě/e}; shift {u} > {ě/e}; maintain {u}~{ě/e}; maintain {u}). Native simplex + {ov} (3): usually {ě}; hřbitov, ostrov, venkov. Native simplex (19): {u}, sometimes {ě}; bod, jazyk, dvůr, hrad, les, kout, led, most, oběd, stůl, svět, tábor, život, byt, den, dům, hlas, klín, sál. Nativized borrowing (3): {u}, sometimes {ě}; kostel, stát, papír. Native deverbal (22): usually {u}; okres, východ, základ, západ, zápas, případ, vůz, časopis, dopis, obchod, obraz, úřad, zákon, závod, obvod, pád, pořad, provoz, přechod, úvod, ústav, způsob. Borrowing (5): almost always {u}; areál, festival, parlament, proces, koncert.

  26. 3. Semantic factors. Where a noun exhibits polysemy, {ě/e} = thing or place; {u} = process (Grepl et al. 1995: 253; Cvrček et al. 2010: 164).

  27. 3. Semantic factors (continued). Table: lexeme; Diakon historical corpus of Czech; modern data (Diakon/SYN2005). obchod 'trade/shop': Diakon: {u} predominant for all meanings; {ě/e} occasional, in all meanings, to the end of the 19th c. Modern: early 20th c.: division between concrete ('shop') and abstract ('trade') meanings. pád 'fall/instance, (gramm.) case': Diakon: {ě/e} only 1x pre-mid-19th c.; else {u}, mostly 'fall', but sometimes 'case, instance'. Modern: {ě/e} popular in late 19th/early 20th c., but {u} predominates in all meanings. provoz 'operation/traffic': Diakon: no {ě/e} pre-1950; {u} inc. 2 exx. from early 20th c. meaning 'traffic'. Modern: {ě/e} from SYN2005 in both meanings: 'traffic' / 'operation'. přechod 'transit/crossing': Diakon: no {ě/e} in any earlier period. Modern: {ě/e} covers all meanings. východ 'exit/sunrise, east': Diakon: {ě/e} only 3x until 19th c., after which it means 'east'; {u} throughout, usu. 'exit' or 'sunrise', but also 'east' into the early 20th c. Modern: SYN2005: shift in most common meaning towards {ě/e}: 4x na východu {u}, 1358x na východě {ě/e}. západ 'turning/sunset, west': Diakon: as above with východ. Modern: as above with východ.

  28. 4. Syntactic factors. Canonical locativity with v, na 'in, on' increases the chances of {ě/e} appearing and being highly rated by speakers; non-canonical locative phrases (other prepositions, interposed adjectives) increase the occurrence and ratings of {u} (Bermel 1993, 2004). Were there more syntactic constructions favouring {ě/e} in later texts?

  29. 4. Syntactic factors: two case studies. zápas 'match, struggle, clash': The most popular context in all time periods is canonical v/ve 'in'. Prior to 1800 it appears only 2x, with {u}; 1800-1850: first example of {ě/e}; 1850-1900: {ě/e} in 53% of examples; 1900-1950: {ě/e} in 70% of examples. Non-locative prepositions and non-canonical construction shapes also have {ě/e}: 1800-1850: no examples; 1850-1900: {ě/e} 62%; 1900-1950: {ě/e} 76%. led 'ice': Some contextual shifts: PREP + ADJ + NOUN constructions prefer {u}, with less of this in some periods. Gradual increase in {ě/e} forms for canonical na ledě ~ ledu 'on the ice': pre-1800: {ě/e} 50%; 1800-1850: {ě/e} 55%; 1850-1900: {ě/e} 88%; 1900-1950: {ě/e} 68%.

  30. Summary of representativity. Do confounding factors account for our data? Phonological factors: no obvious overrepresentation of certain phonological environments (overall % of these environments for the class?). Word-formational/etymological factors: simplexes well represented; still, results overall lean more towards {ě/e}. Semantic factors: historical data yield no support for this (shift applies for all meanings). Syntactic factors: some evidence of specialisation by context, but a late appearance and not consistent (shift otherwise applies in all contexts). Hence: confounding factors do not explain the trend towards {ě/e}. We are left with frequency as an explanation?

  31. Section 6: Explaining unexpected results

  32. Three possibilities: 1. Our method was not successful. 2. Certain types of data were foregrounded. 3. Other factors have interfered: sociolinguistic? cognitive?

  33. Background for a new hypothesis. For all Czech nouns, dat. sg. >< loc. sg. noun forms. Differentiation by syntactic environment and adjective: díky tomuto způsobu >< v tomto způsobu (masc. inan.); k tomuto pánovi >< o tomto pánovi (masc. anim.); k této holce >< o této holce (fem.); k tomuto letišti >< na tomto letišti (masc./neut. soft).

  34. New hypothesis. Semantic/syntactic load of noun case form decreases ({u} ~ {u}) = increase in load elsewhere (higher-level constructions). Entrenchment of forms where older distinctions between dat. and loc. are observed ({u} vs. {ě/e}): entrenchment associated with repetition = frequent items more likely to undergo this process; lack of adjective, canonical phrases of location.

  35. Recessive {e/ě}; expansive {u}

  36. Retextualization? "Variation when balanced in a complex system can evolve slowly and in various directions at once, rather than speeding towards completion" (Nichols & Timberlake 1991)

  37. References
      Bermel, N., 1993, Sémantické rozdíly v tvarech českého lokálu [Semantic differences in forms of the Czech locative], Naše řeč 76, p. 192-198.
      Bermel, N., 2004, V korpuse nebo v korpusu? Co nám řekne (a neřekne) ČNK o morfologické variaci v tvarech lokálu [V korpuse or v korpusu? What the Czech National Corpus can (and can't) tell us about variation in the locative singular], in Z. Hladká, P. Karlík (eds.), Čeština - univerzália a specifika [The Czech language - universals and specifics], Prague, Nakladatelství Lidové noviny, p. 163-171.
      Bermel, N. & L. Knittl, 2012a, Morphosyntactic variation and syntactic constructions in Czech nominal declension: corpus frequency and native-speaker judgments, Russian Linguistics 36, p. 91-119.
      Bermel, N. & L. Knittl, 2012b, Corpus frequency and acceptability judgments: A study of morphosyntactic variants in Czech, Corpus Linguistics and Linguistic Theory 8, p. 241-275.
      Blythe, R.A. & W. Croft, 2012, S-curves and the mechanisms of propagation in language change, Language 88, p. 269-304.
      Brown, D., 2007, Peripheral functions and overdifferentiation: The Russian second locative, Russian Linguistics 31, p. 61-76.
      Bybee, J., 2006, From usage to grammar: the mind's response to repetition, Language 82, p. 711-733.
      Croft, W., 2000, Explaining Language Change: An Evolutionary Approach, Harlow, Longman.
      Cvrček, V. & P. Vondřička, 2011, SyD - Korpusový průzkum variant, Prague, Ústav Českého národního korpusu FF UK, http://syd.korpus.cz
      Cvrček, V., V. Kodýtek, M. Kopřivová, D. Kováříková, P. Sgall, M. Šulc, J. Táborský, J. Volín, M. Waclawičová, 2010, Mluvnice češtiny I [A Grammar of Czech I], Prague, Nakladatelství Karolinum.
      Denison, D., 2003, Log(ist)ic and simplistic S-curves, in Hickey, R. (ed.), Motives for Language Change, Cambridge, Cambridge University Press, p. 54-70.
      Grepl, M., Z. Hladká, M. Jelínek, P. Karlík, M. Krčmová, M. Nekula, Z. Rusínová & D. Šlosar, 1995, Příruční mluvnice češtiny [Handbook Grammar of Czech], Prague, Nakladatelství Lidové noviny.
      Janda, L., 1996, Back from the brink: A study of how relic forms in language serve as source material for analogical extension, Munich and Newcastle, Lincom Europa.
      Kučera, K., A. Řehořková & M. Stluka, 2015, DIAKORP - diachronic corpus, version 6, http://www.korpus.cz, Prague, Czech National Corpus Institute, Faculty of Arts and Philosophy, Charles University.
      Lamprecht, A., D. Šlosar & J. Bauer, 1986, Historická mluvnice češtiny [A Historical Grammar of Czech], Prague, Státní pedagogické nakladatelství.
      Meillet, A., 1965, Le Slave commun, Paris, Librairie Honoré Champion.
      Nichols, J. & A. Timberlake, 1991, Grammaticalization as retextualization, in E. C. Traugott & B. Heine (eds.), Approaches to Grammaticalization, vol. I: Focus on Theoretical and Methodological Issues, Amsterdam and Philadelphia, John Benjamins, p. 129-146.
