Managing Ambiguity in Language: An Innovative Approach
Language presents challenges due to its inherent ambiguity, making it difficult to map between form and meaning. This content discusses an architecture for structured ambiguity management, emphasizing the complexities of linguistic interpretation, modular mapping from text to meaning, and the pervasive nature of ambiguity across various linguistic levels.
- Ambiguity Management
- Linguistic Interpretation
- Modular Mapping
- Structural Semantics
- Language Complexity
Presentation Transcript
An Architecture for Structured Ambiguity Management
Ronald M. Kaplan
Difficult to map between form and meaning
- Languages are hard to describe: meaning depends on complex properties of words and sequences.
  - The chicken is eager to eat: the chicken is hungry.
  - The chicken is easy to eat: someone can eat the chicken.
  - John looked up it vs. John looked it up.
- Unpredictable interactions: different languages rely on different properties.
  - John used a periscope vs. John used a phone book.
- Languages are hard to compute: expensive to recognize complex patterns, and patterns overlap.
  - The chicken is ready to eat.
- Ambiguities multiply: explosion in time and space.
Modular mapping from text to meaning
- Pipeline: input text → parsing (constituents, dependents) → structural semantics → conceptual semantics → contextual interpretation (discourse, pragmatics, knowledge representation) → inference, command/control, semantic search, sensible dialog, etc.
- A nearly-decomposable system (Simon, 1962): evolution produces modularity with relatively minor interactions.
- Scientific understanding: complexity is only apparent.
- Effective engineering: each module can be developed, tuned, and maintained separately.
Modular mapping from text to meaning (continued)
Module-specific representations best encode local properties.
- Parsing (Kaplan & Bresnan 1982): constituents and dependents.
- Structural semantics: some(λx. suspect(x), λy. ∃e. talk(e,y) & past(e))
- Conceptual semantics:
  subtype(Talk1, InformationTransferEvent)
  subtype(Suspect2, SuspectedCriminal)
  subtype(Info3, InformationObject)
  role(performedBy, Talk1, Suspect2)
  role(infoConveyed, Talk1, Info3)
- Contextual interpretation (Condoravdi et al. 2003):
  ist(t, isa(talk1, InformationTransferEvent))
  ist(t, isa(suspect2, Agent))
  ist(t, isa(info3, InformationObject))
  ist(t, performedBy(talk1, suspect2))
  ist(t, infoConveyed(talk1, info3))
  ist(t, isa(agent4, Agent))
  ist(t, suspects(agent4, susp5))
  ist(susp5, isa(crime6, CriminalAct))
  ist(susp5, performedBy(crime6, suspect2))
- Feeds inference, command/control, semantic search, sensible dialog, etc., over discourse, pragmatics, and knowledge representation.
Ambiguity is pervasive, at every level: tokenization, lexicon, syntax, semantics, discourse, entailment.
- I like Jan. : |Jan|.| or |Jan.|.| (sentence end or abbreviation)?
- walks: noun or verb?
- untieable knot: (untie)able or un(tieable)?
- bank: river or financial? General Mills: person or company?
- John told Tom he had to go. Who had to go?
- The chicken is ready to eat. Cooked or hungry?
- Every man loves a woman. The same woman, or each their own?
- John didn't wait to go. Now or never?
- Bill fell. John kicked him. Because or after? him = John or Bill?
Coverage vs. ambiguity
A grammar that covers both "I fell in the park." (PP attaches to the verb) and "I know the girl in the park." (PP attaches to the noun) makes "I see the girl in the park." ambiguous between the two attachments.
Ambiguity can be explosive
- If alternatives multiply within or across components (tokenize, morphology, syntax, semantics, knowledge), counts grow multiplicatively.
- But the best module-internal result may depend on later information:
  - Watch [a movie with Tom Cruise]: actor (IMDB)
  - Watch [a movie] with Sally Jones: friend (contacts)
- Local ambiguities need global resolution.
Ambiguity strategies
General techniques to control computation:
- Prune: discard unlikely/inconsistent candidates early.
- Procrastinate: underspecify; expand only when needed.
- Eliminate boundaries: merge separate constraint systems.
- Manage: pack representations to share common computations.
Special techniques can apply within modules, making use of particular structures and methods: word lattice, parse chart, Viterbi.
But what happens at module interfaces?
- Do follow-on modules interpret the special ambiguity structures of earlier components?
- Full enumeration?
- A separate meta-architecture for managing ambiguity?
Pruning: premature disambiguation
- Typical approach: use heuristics to kill candidates as soon as possible, in speech/tokens, morphology, syntax, and semantics.
- Oops: late hard constraints (statistics, knowledge) may reject the so-far-best (= only) surviving option.
- Fast computation, wrong result.
Resolving information is unpredictable
- Nonlinguistic context: The chicken is ready to eat. (cf. eager/easy) Kitchen or barnyard?
- Ontological/lexical accidents: The veal is ready to eat vs. The calf is ready to eat.
Procrastination: passing the buck
Chunk parsing as an example:
- Collect noun groups, verb groups, PP groups; underspecify relations.
- Some extensions are nonsense; leave it to later processing to combine and filter.
Later processing must either:
- call (another) parser to check constraints,
- apply its own model of constraints (= a grammar), or
- solve unevaluated constraints left in the chunker output.
Module independence is implicitly weakened.
Eliminate module boundaries
- Few interactions → anything goes: a free-for-all.
- Gives up on (near) decomposability: scientific pessimism.
- Engineering in an unstructured cross-product space.
- The mantra of end-to-end (deep) learning: correlation/perception vs. interpretation/cognition.
- No relation to previously acquired knowledge; coarse, sparse data; lowered ambitions.
Ambiguity management: pack and share
Speech word-lattice (US presidents): between numbered states, arcs carry the hypotheses Jack's son / Jackson and John's son / Johnson, with Andrew ... and ... on the shared spine.
- Separate choices are computed independently.
- Concatenating any subpath into a node with any subpath out of that node gives a good complete result.
- Exponentially many paths, but each can be read out in linear time.
- If the next module takes lattice input, no need to enumerate at the interface. Otherwise, revert to simple strings:
  - Andrew John's son and Andrew Jack's son.
  - Andrew John's son and Andrew Jackson.
  - Andrew Johnson and Andrew Jack's son.
  - Andrew Johnson and Andrew Jackson.
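The lattice above can be sketched as a small Python structure (the node numbering and edge layout are illustrative, reconstructing the slide's diagram):

```python
# Word lattice for the "US presidents" example: node -> [(word, next_node)].
edges = {
    0: [("Andrew", 1)],
    1: [("John's", 2), ("Johnson", 3)],
    2: [("son", 3)],
    3: [("and", 4)],
    4: [("Andrew", 5)],
    5: [("Jack's", 6), ("Jackson", 7)],
    6: [("son", 7)],
    7: [],
}

def count_paths(node, end=7, memo=None):
    """Count complete paths with dynamic programming: linear in lattice size."""
    memo = {} if memo is None else memo
    if node == end:
        return 1
    if node not in memo:
        memo[node] = sum(count_paths(nxt, end, memo) for _, nxt in edges[node])
    return memo[node]

def enumerate_paths(node=0, end=7, prefix=()):
    """Read out each packed hypothesis; each readout is linear in its length."""
    if node == end:
        yield " ".join(prefix)
    for word, nxt in edges[node]:
        yield from enumerate_paths(nxt, end, prefix + (word,))

print(count_paths(0))   # 4 complete hypotheses from 2 + 2 packed choices
for sentence in enumerate_paths():
    print(sentence)
```

The two binary choice-sets are stored once each, yet they stand for all four strings: the exponential/linear trade-off the slide describes.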
Pack and share
Context-free parse chart: any structure over the beginning of the string is compatible with any following structure.
The old men and women saw the girl with the telescope
- [Old men] and women vs. old [men and women]
- [the girl] [with the telescope] vs. [the girl [with the telescope]]
- x saw is shared, and the coordination choice doesn't depend on the PP choice, so it is not unpacked in syntax.
Bet on independence, not inconsistency: nearly decomposable.
Bet on independence: free choice
The fish ate the sheep. How many fish? How many sheep?
Options multiplied out:
- The fish-sg ate the sheep-sg.
- The fish-pl ate the sheep-sg.
- The fish-sg ate the sheep-pl.
- The fish-pl ate the sheep-pl.
Options packed: The fish-{sg|pl} ate the sheep-{sg|pl}.
In principle, a verb might require agreement of subject and object, which would have to be checked. English doesn't do that: the subparts are independent.
A packed representation is a free choice system:
- Encodes all known dependencies without loss of information.
- Common items are represented, and computed, once.
- Exponential candidates with linear per-candidate read-out.
- Key to modular efficiency.
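A minimal sketch of the packed agreement features, assuming the dictionary encoding below (it is not the paper's data structure): two independent choice-sets are stored, and the four full analyses exist only on read-out.

```python
from itertools import product

# Packed number features for "The fish ate the sheep": 2 + 2 stored values
# stand in for 2 * 2 enumerated analyses, because the choices are free.
packed = {"fish": ["sg", "pl"], "sheep": ["sg", "pl"]}

def readout(packed):
    """Expand a free-choice system into its individual analyses."""
    words, choices = zip(*packed.items())
    for combo in product(*choices):
        yield dict(zip(words, combo))

analyses = list(readout(packed))
print(len(analyses))  # 4
```

With n independent binary choice-sets the packed form stores 2n values for 2^n candidates, which is the efficiency the slide claims.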
Dependent choices (Maxwell & Kaplan 1991)
Das Mädchen sah die Katze: both Mädchen and Katze are ambiguous between nom and acc (The girl saw the cat / The cat saw the girl).
Naive packing again avoids duplication, but incorrectly: it doesn't encode all dependencies, and the choices are not free (LFG: predicates are unique).
- Das Mädchen-nom sah die Katze-nom: bad
- Das Mädchen-nom sah die Katze-acc: the girl saw the cat
- Das Mädchen-acc sah die Katze-nom: the cat saw the girl
- Das Mädchen-acc sah die Katze-acc: bad
Similarly: Who do you want to stop?
- I want to stop [who]: want intransitive, stop transitive
- I want [who] to stop: want transitive, stop intransitive
Solution: label dependent choices
- Label each choice with distinct Boolean variables p, q, etc.: p = Mädchen-nom (¬p = Mädchen-acc), q = Katze-nom (¬q = Katze-acc).
- Record the acceptable combinations as a Boolean expression: (p ∧ ¬q) ∨ (¬p ∧ q).
- Each analysis corresponds to a satisfying truth-value assignment.
- Free choice from the true lines of the expression's truth table:
  - p ∧ ¬q: Das Mädchen-nom sah die Katze-acc
  - ¬p ∧ q: Das Mädchen-acc sah die Katze-nom
  - p ∧ q and ¬p ∧ ¬q: bad
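The labeling scheme can be sketched directly: enumerate the truth table of the constraint and keep the satisfying lines (the lambda encoding of the constraint is illustrative):

```python
from itertools import product

# Boolean labels: p = Mädchen-nom (True) / Mädchen-acc (False),
#                 q = Katze-nom (True)   / Katze-acc (False).
# Grammar constraint: exactly one nominative and one accusative.
constraint = lambda p, q: (p and not q) or (not p and q)

case = lambda b: "nom" if b else "acc"
analyses = [
    (f"Mädchen-{case(p)}", f"Katze-{case(q)}")
    for p, q in product([True, False], repeat=2)
    if constraint(p, q)
]
print(analyses)
# Only the two satisfying assignments survive; the nom-nom and acc-acc
# lines of the truth table are rejected.
```
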
Ambiguity as (meta)disjunction
- walk+s: [noun plural] ∨ [verb singular-subject]
- sheep: noun [singular ∨ plural]
- ready: subject = complement subject (hungry) ∨ subject = complement object (cooked)
- Ambiguity resolution reduces to Boolean satisfiability:
  The sheep [subject [singular ∨ plural]] walks [[noun plural] ∨ [verb singular-subject]]
  ⊢ subject singular ∧ verb singular-subject
- (Meta: Ed believes the chicken is ready to eat ⊢ Ed believes cooked-or-hungry.)
Boolean satisfiability
Can solve Boolean formulas by multiplying out to Disjunctive Normal Form:
  (a ∨ b) ∧ e ∧ (c ∨ d) ⇒ (a ∧ e ∧ c) ∨ (a ∧ e ∧ d) ∨ (b ∧ e ∧ c) ∨ (b ∧ e ∧ d)
where a, b, c, d, e are facts in a base theory: strings, trees, graphs, formulas.
- Produces simple conjunctions of literal propositions.
- Easy checks for satisfiability: if a ∧ d ⊢ FALSE, replace any conjunction containing a and d by FALSE.
- But: blow-up of disjunctive structure before fact processing; individual facts are replicated (and re-processed): exponential time.
Alternative: contexted normal form
  (a ∨ b) ∧ (c ∨ d) ⇒ (p → a) ∧ (¬p → b) ∧ (q → c) ∧ (¬q → d)
Each conjunct is a context → fact pair.
- Produce a flat conjunction of contexted base-theory facts.
- Label each fact with its position in the disjunctive structure; discard the Boolean hierarchy.
- No blow-up, no duplicates: each fact appears, and can be processed, once.
Claims:
- Checks for satisfiability are still easy.
- Facts can be processed first, disjunctions deferred.
- A conjunctive normal form, computationally convenient.
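The conversion can be sketched as follows, encoding contexts as strings and facts as opaque values (a deliberately simplified sketch; binary disjunctions only):

```python
import itertools

fresh = itertools.count(1)  # supply of new Boolean variables p1, p2, ...

def context_facts(disjunction):
    """Label a binary disjunction's facts with a fresh Boolean variable:
    a ∨ b  becomes  (p → a) ∧ (¬p → b)."""
    a, b = disjunction
    p = f"p{next(fresh)}"
    return [(p, a), (f"~{p}", b)]

# (a ∨ b) ∧ (c ∨ d) becomes a flat conjunction of contexted facts:
cnf = context_facts(("a", "b")) + context_facts(("c", "d"))
print(cnf)
# [('p1', 'a'), ('~p1', 'b'), ('p2', 'c'), ('~p2', 'd')]
```

Each base fact appears exactly once, so there is no DNF blow-up and no re-processing of duplicated facts.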
A sound and complete method
Contexted form is logically equivalent and supports valid inferences.
Lemma: φ ∨ ψ is satisfiable iff (p → φ) ∧ (¬p → ψ) is satisfiable (p a new Boolean variable).
Proof sketch: (If) If φ is true, let p be true, in which case (p → φ) ∧ (¬p → ψ) is true; symmetrically for ψ. (Only if) If p is true, then φ is true, in which case φ ∨ ψ is true; symmetrically for ¬p.
Conversion of inference rules (by trivial logic): if φ, ψ ⊢ χ is a rule of inference in a module's base theory, then
  (C1 → φ), (C2 → ψ) ⊢ (C1 ∧ C2) → χ
is also valid (C1 and C2 are Boolean combinations of p's, q's, etc.).
E.g. substitution of equals for equals: x=y, φ ⊢ φ[x/y] is a rule of inference; therefore
  (C1 → x=y), (C2 → φ) ⊢ (C1 ∧ C2) → φ[x/y]
Example: The sheep walks
Disjunction of the NUM feature from sheep: (f SUBJ NUM)=SG ∨ (f SUBJ NUM)=PL.
Contexted facts:
- p → (f SUBJ NUM)=SG
- ¬p → (f SUBJ NUM)=PL
- (f SUBJ NUM)=SG (from walks)
Inferences:
- p → (f SUBJ NUM)=SG, (f SUBJ NUM)=SG ⊢ p → SG=SG (trivially true)
- ¬p → (f SUBJ NUM)=PL, (f SUBJ NUM)=SG ⊢ ¬p → PL=SG, i.e. ¬p → FALSE
- ¬p → FALSE is true iff ¬p is false iff p is true.
Conclusion: the sentence is grammatical only in context p: only 1 sheep.
Test for satisfiability
- Suppose C → FALSE is deduced from a contexted formula φ. Then φ is satisfiable only if ¬C. Such a C is called a nogood context (cf. ATMS).
- E.g. if C → SG=PL and SG=PL ⊢ FALSE, then C is a nogood.
- Efficiency: Boolean solving applies only to the conjunction of nogoods, and nogoods arise only when fact-inferences deduce FALSE.
- Implicitly notices independence/context-freeness: FALSE is never deduced from independent facts.
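The sheep example can be worked mechanically. This is a minimal sketch, assuming a toy encoding where contexts are literals over a single variable p and the only base-theory inference is equality-checking of NUM values:

```python
from itertools import combinations

# Contexted NUM facts for "The sheep walks":
#   sheep contributes p → SG and ~p → PL; walks contributes SG (context TRUE).
facts = [("p", "SG"), ("~p", "PL"), ("TRUE", "SG")]

def conjoin(c1, c2):
    """Conjoin two literal contexts; None means the conjunction is p ∧ ¬p."""
    parts = {c for c in (c1, c2) if c != "TRUE"}
    if any(("~" + c) in parts for c in parts):
        return None
    return parts or {"TRUE"}

# Whenever two facts clash (e.g. PL = SG ⊢ FALSE), their conjoined context
# becomes a nogood.
nogoods = []
for (c1, v1), (c2, v2) in combinations(facts, 2):
    ctx = conjoin(c1, c2)
    if ctx is not None and v1 != v2:
        nogoods.append(ctx)

print(nogoods)  # [{'~p'}]: the analysis fails exactly when ~p holds
```

Boolean solving then touches only this one-element conjunction of nogoods: the surviving context is p, i.e. "only 1 sheep", as on the previous slide.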
Model building
- Substitute free-choice values for p, q from the true lines of the constraint's truth table into the constraint contexts.
- Extract and interpret the conjunction of constraints labeled by T.
With (p ∧ ¬q) ∨ (¬p ∧ q) for Das Mädchen sah die Katze:
- p=T, q=F: Das Mädchen-nom sah die Katze-acc (girl saw cat)
- p=F, q=T: Das Mädchen-acc sah die Katze-nom (cat saw girl)
The wager (for mappings of real sentences of real languages)
- FALSE is unlikely: far fewer interactions than there might be; k inconsistencies in a system of n constraints, where k << n.
- Complexity is 2^k instead of 2^n.
- Alternatives from distant choice-sets can be freely chosen without affecting satisfiability.
- Ambiguity management optimizes for independent constraints: no FALSE, so no nogoods, so nothing to solve.
Computational efficiencies (when we win the bet):
- Apply all local constraints: no procrastination.
- Maintain all options: prune only for known inconsistencies.
- Downstream modules inherit common constraints.
- No enumeration at module boundaries.
Ambiguity-enabled composition
Choice-logic and context-space are common to all modules. If φ, ψ ⊢ χ is a rule of inference, then so is C1 → φ, C2 → ψ ⊢ (C1 ∧ C2) → χ.
1. Substitution of equals for equals (e.g. for syntactic constraints): x=y, φ ⊢ φ[x/y]; therefore C1 → x=y, C2 → φ ⊢ (C1 ∧ C2) → φ[x/y].
2. Reasoning: Cause(x,y), Prevent(y,z) ⊢ Prevent(x,z); therefore C1 → Cause(x,y), C2 → Prevent(y,z) ⊢ (C1 ∧ C2) → Prevent(x,z).
3. Log-linear disambiguation (PCFG algorithms on the and-or tree): Prop1(x), Prop2(x) ⊢ Count(Feature_n); therefore C1 → Prop1(x), C2 → Prop2(x) ⊢ (C1 ∧ C2) → Count(Feature_n).
Ambiguity-enabled components propagate choices across module boundaries and can defer choosing and enumerating.
Syntactic packing preserved in semantics (Crouch 2005)
Ed saw the girl with the telescope.
Facts common to all choices:
  (context t)
  (instantiable Ed23 t)
  (instantiable girl22 t)
  (instantiable see_ev20 t)
  (instantiable telescope21 t)
  (role see_ev20 perceivedThings girl22)
  (role see_ev20 performedBy Ed23)
  (subconcept Ed23 Person)
  (subconcept girl22 FemaleChild)
  (subconcept see_ev20 VisualPerception)
  (subconcept telescope21 Telescope)
  (temporalRel startsAfterEndingOf Now see_ev20)
Contexted facts for the PP attachment (instrumental vs. possessive reading):
  P: (role see_ev20 deviceUsed telescope21)
  ¬P: (role girl22 possesses telescope21)
No enumeration.
A general architecture for ambiguity
If φ, ψ ⊢ χ is a rule of inference, then so is C1 → φ, C2 → ψ ⊢ (C1 ∧ C2) → χ.
Suppose you have a computation in some base theory: Comp(φ, ψ) → χ. Then an ambiguity-enabled wrapper can be defined to operate on context-argument pairs:
  Comp-AE(<C1, φ>, <C2, ψ>) → <Conj(C1, C2), χ>
The XLE LFG parser + semantic interpreter includes a family of Boolean context-operations, maintains an internal and-or tree, solves for nogoods, disambiguates, etc.
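The wrapper pattern can be sketched in a few lines. This is illustrative, not XLE's API: contexts are modeled as frozensets of Boolean literals and Conj as set union, and the base computation is a toy substitution-of-equals:

```python
def comp_ae(comp):
    """Lift a base-theory computation Comp(x, y) to operate on
    <context, value> pairs, conjoining the contexts."""
    def wrapped(cv1, cv2):
        (c1, x), (c2, y) = cv1, cv2
        return (c1 | c2, comp(x, y))   # <Conj(C1, C2), Comp(x, y)>
    return wrapped

# Base computation: substitution of equals for equals, here naive string
# rewriting (a stand-in for inference over real syntactic constraints).
subst = comp_ae(lambda eq, term: term.replace(eq[0], eq[1]))

# From <C1, x=y> and <C2, f(x)> derive <C1 ∧ C2, f(y)>.
result = subst((frozenset({"p"}), ("x", "y")), (frozenset({"q"}), "f(x)"))
print(result)  # context {p, q} paired with the rewritten fact f(y)
```

Because the wrapper never inspects the values, any module-internal computation can be ambiguity-enabled this way without enumerating alternatives at its interface.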
Stochastic disambiguation
All packed results → ranked list.
- Discriminative ranking (Riezler et al. 02, Kaplan et al. 04): conditional log-linear model on structure-features.
- Discriminative estimation from partially labeled data; combined L1-regularization and feature selection.
- Efficient feature extraction from the packed AND/OR forest of ambiguity management.
- Dynamic programming for estimation and disambiguation; unpacks only as needed for feature dependencies (Miyao and Tsujii 02). Similar to the inside-outside algorithm, but over the choice tree.
- Disambiguation cost: ~5%.
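The Viterbi-style half of this can be sketched on a toy AND/OR tree (the tree shape, node encoding, and scores are illustrative; real XLE forests share subtrees, which a memo table would exploit):

```python
# AND nodes carry a local log-linear feature score and combine subanalyses;
# OR nodes choose the best-scoring alternative. A toy PP-attachment choice:
tree = ("or",
        ("and", 1.2, [("leaf", "high-attach", 0.0)]),
        ("and", 0.4, [("leaf", "low-attach", 0.0)]))

def best(node):
    """Return (score, chosen parts) for the best analysis under node."""
    kind = node[0]
    if kind == "leaf":
        return node[2], [node[1]]
    if kind == "and":
        score, parts = node[1], []
        for child in node[2]:
            s, p = best(child)
            score += s
            parts += p
        return score, parts
    # "or": score each alternative once, keep the maximum
    return max((best(child) for child in node[1:]), key=lambda t: t[0])

score, analysis = best(tree)
print(score, analysis)
```

Each forest node is scored without unpacking the exponentially many analyses it stands for; estimation would additionally need the inside-outside-style sum over alternatives that the slide mentions.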
Summary
- Sophisticated language-based capabilities depend on pipelines that map from text to meaning representations: natural dialog, command and control of devices, semantic search, etc.
- Pipelines will involve independently developed/trained modules that impose different kinds of constraints on different kinds of representations, for development, maintenance, and accuracy.
- Ambiguity within and between modules is a key computational issue.
- The architecture packs structures and shares computations:
  - sound and complete with respect to currently known constraints,
  - easy to embed in existing implementations,
  - preserves ambiguity across module boundaries, typically without interface enumeration of alternatives.
- Global resolution of ambiguity while preserving modularity.