Understanding Natural Language Generation (NLG) Process


Natural Language Generation (NLG) is the process of constructing natural language outputs from non-linguistic inputs. It involves generating text from machine representations to meet specific communicative goals. NLG is distinct from Natural Language Understanding (NLU) as it maps meaning to text, while NLU maps text to meaning. This process requires knowledge of the target language and domain to produce coherent and meaningful text. Various definitions and examples illustrate the significance and application of NLG in generating reports, explanations, and other textual outputs.


Uploaded on Sep 28, 2024



Presentation Transcript


  1. NLG

  2. Agenda
     Natural Language Generation (NLG)
     Generation Steps
     Natural Language Processing (NLP) by Rahman Ali, Lect: QACC, UOP, 28 September 2024

  3. Definition 1: NLG
     Natural Language Generation (NLG) is the process of constructing natural language outputs from non-linguistic inputs.
     Goal: The goal of this process can be viewed as the inverse of that of natural language understanding (NLU).
     NLU vs NLG: NLG maps from meaning to text, while NLU maps from text to meaning. (Jurafsky, Chapter 20)

  4. Definition 2: NLG
     Natural Language Generation (NLG) is the natural language processing task of generating natural language from a machine representation system such as a knowledge base or a logical form. (http://en.wikipedia.org/wiki/Natural_language_generation, Retrieved: 31 Oct, 2010)

  5. Definition 3: NLG
     Natural language generation is the process of deliberately constructing a natural language text in order to meet specified communicative goals. [McDonald 1992]

  6. What is NLG? Or Ingredients of NLG
     Goal: Computer software which produces understandable and appropriate texts in English or other human languages
     Input: Some underlying non-linguistic representation of information
     Output: Documents, reports, explanations, help messages, and other kinds of texts
     Knowledge sources required: Knowledge of the target language and of the domain

  7. Example System #1: FoG
     Function: Produces textual weather reports in English and French
     Input: Graphical/numerical weather depiction
     User: Environment Canada (Canadian Weather Service)
     Developer: CoGenTex
     Status: Fielded, in operational use since 1992

  8. FoG: Input

  9. FoG: Output

  10. Example System #2: PlanDoc
      Function: Produces a report describing the simulation options that an engineer has explored
      Input: A simulation log file
      User: Southwestern Bell Telephone Company (Texas)
      Developer: Bellcore and Columbia University
      Status: Fielded, in operational use since 1996

  11. PlanDoc: Input
      RUNID fiberall FIBER 6/19/93 act yes
      FA 1301 2 1995
      FA 1201 2 1995
      FA 1401 2 1995
      FA 1501 2 1995
      ANF co 1103 2 1995 48
      ANF 1201 1301 2 1995 24
      ANF 1401 1501 2 1995 24
      END. 856.0 670.2

  12. PlanDoc: Output
      This saved fiber refinement includes all DLC changes in Run-ID ALLDLC. RUN-ID FIBERALL demanded that PLAN activate fiber for CSAs 1201, 1301, 1401 and 1501 in 1995 Q2. It requested the placement of a 48-fiber cable from the CO to section 1103 and the placement of 24-fiber cables from section 1201 to section 1301 and from section 1401 to section 1501 in the second quarter of 1995. For this refinement, the resulting 20 year route PWE was $856.00K, a $64.11K savings over the BASE plan and the resulting 5 year IFC was $670.20K, a $60.55K savings over the BASE plan.
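The step from log records to the "activate fiber" sentence above can be sketched in a few lines. This is only an illustration, not PlanDoc's actual method: the reading of an `FA` record as `FA <CSA> <quarter> <year>` is an assumption inferred from the sample input and output, and the function name `describe_activations` is hypothetical.

```python
from typing import Dict, List, Tuple

# Sample log lines copied from the PlanDoc input slide.
LOG = [
    "RUNID fiberall FIBER 6/19/93 act yes",
    "FA 1301 2 1995",
    "FA 1201 2 1995",
    "FA 1401 2 1995",
    "FA 1501 2 1995",
    "ANF co 1103 2 1995 48",
    "ANF 1201 1301 2 1995 24",
    "ANF 1401 1501 2 1995 24",
    "END. 856.0 670.2",
]


def describe_activations(lines: List[str]) -> List[str]:
    """Group FA records by (year, quarter) and emit one sentence per group.

    Assumption: an FA record has the form 'FA <CSA> <quarter> <year>'.
    """
    groups: Dict[Tuple[str, str], List[str]] = {}
    for line in lines:
        fields = line.split()
        # Skip every record that is not a four-field FA record.
        if len(fields) != 4 or fields[0] != "FA":
            continue
        _, csa, quarter, year = fields
        groups.setdefault((year, quarter), []).append(csa)

    sentences = []
    for (year, quarter), csas in sorted(groups.items()):
        csas = sorted(csas)
        if len(csas) == 1:
            listing = "CSA " + csas[0]
        else:
            listing = "CSAs " + ", ".join(csas[:-1]) + " and " + csas[-1]
        sentences.append(f"PLAN activates fiber for {listing} in {year} Q{quarter}.")
    return sentences


print(describe_activations(LOG)[0])
```

Merging four separate records into one sentence is what keeps the report readable; a one-sentence-per-record rendering would repeat the same frame four times.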

  13. Example System #3: STOP
      Function: Produces a personalized smoking-cessation leaflet
      Input: Questionnaire about smoking attitudes, beliefs, history
      User: NHS (British Health Service)
      Developer: University of Aberdeen
      Status: Undergoing clinical evaluation to determine its effectiveness

  14. STOP: Input
      SMOKING QUESTIONNAIRE
      Please answer by marking the most appropriate box for each question. Please read the questions carefully. If you are not sure how to answer, just give the best answer you can.
      Q1 Have you smoked a cigarette in the last week, even a puff?
         YES: Please complete the following questions
         NO: Please return the questionnaire unanswered in the envelope provided. Thank you.
      Q2 Home situation: Live alone / Live with husband/wife/partner / Live with other adults / Live with children
      Q3 Number of children under 16 living at home: boys 1, girls (blank)
      Q4 Does anyone else in your household smoke? (If so, please mark all boxes which apply): husband/wife/partner / other family member / others
      Q5 How long have you smoked for? 10 years (Tick here if you have smoked for less than a year)

  15. STOP: Output
      Dear Ms Cameron
      Thank you for taking the trouble to return the smoking questionnaire that we sent you. It appears from your answers that although you're not planning to stop smoking in the near future, you would like to stop if it was easy. You think it would be difficult to stop because smoking helps you cope with stress, it is something to do when you are bored, and smoking stops you putting on weight. However, you have reasons to be confident of success if you did try to stop, and there are ways of coping with the difficulties.

  16. Example System #4: TEMSIS
      Function: Summarizes pollutant information for environmental officials
      Input: Environmental data + a specific query
      User: Regional environmental agencies in France and Germany
      Developer: DFKI GmbH
      Status: Prototype developed; requirements for fielded system being analyzed

  17. TEMSIS: Input Query
      ((LANGUAGE FRENCH)
       (GRENZWERTLAND GERMANY)
       (BESTAETIGE-MS T)
       (BESTAETIGE-SS T)
       (MESSSTATION "Voelklingen City")
       (DB-ID "#2083")
       (SCHADSTOFF "#19")
       (ART MAXIMUM)
       (ZEIT ((JAHR 1998) (MONAT 7) (TAG 21))))

  18. TEMSIS: Output Summary
      French: Le 21/7/1998 à la station de mesure de Völklingen-City, la valeur moyenne maximale d'une demi-heure (Halbstundenmittelwert) pour l'ozone atteignait 104.0 µg/m³. Par conséquent, selon le decret MIK (MIK-Verordnung), la valeur limite autorisée de 120 µg/m³ n'a pas été dépassée.
      German: Der höchste Halbstundenmittelwert für Ozon an der Meßstation Völklingen-City erreichte am 21. 7. 1998 104.0 µg/m³, womit der gesetzlich zulässige Grenzwert nach MIK-Verordnung von 120 µg/m³ nicht überschritten wurde.
      (English gloss: On 21/7/1998 at the Völklingen-City measuring station, the maximum half-hour mean value for ozone reached 104.0 µg/m³. Consequently, according to the MIK decree, the permitted limit of 120 µg/m³ was not exceeded.)

  19. Types of NLG Applications
      Automated document production: weather forecasts, simulation reports, letters, ...
      Presentation of information to people in an understandable fashion: medical records, expert system reasoning, ...
      Teaching: information for students in CAL systems
      Entertainment: jokes (?), stories (??), poetry (???)

  20. An Architecture for Generation (Jurafsky, Chapter 20)

  21. An Architecture for Generation (Cont.)
      Discourse Planner: This component starts with a communicative goal and makes all the choices. It selects the content from the knowledge base and then structures that content appropriately. The resulting discourse plan will specify all the choices made for the entire communication, potentially spanning multiple sentences and including other annotations (including hypertext, figures, etc.).

  22. An Architecture for Generation (Cont.)
      Surface Realizer: This component receives the fully specified discourse plan and generates individual sentences as constrained by its lexical and grammatical resources. These resources define the realizer's potential range of output. If the plan specifies multiple-sentence output, the surface realizer is called multiple times.
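The division of labour between the two components can be sketched as follows. This is a toy illustration under stated assumptions, not the architecture from the textbook: the knowledge base contents, the `SentenceSpec` structure, and all function names are hypothetical. The point it shows is that the planner makes every content choice once, and the realizer is then called once per planned sentence.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class SentenceSpec:
    """One fully specified sentence in the discourse plan (hypothetical structure)."""
    subject: str
    verb: str
    obj: str


# Hypothetical knowledge base: communicative goal -> facts that serve it.
KNOWLEDGE_BASE: Dict[str, List[Tuple[str, str, str]]] = {
    "weather": [
        ("the temperature", "will reach", "25 degrees"),
        ("winds", "will stay", "light"),
    ],
}


def discourse_planner(goal: str) -> List[SentenceSpec]:
    """Select content for the goal and order it into a multi-sentence plan."""
    return [SentenceSpec(s, v, o) for s, v, o in KNOWLEDGE_BASE.get(goal, [])]


def surface_realizer(spec: SentenceSpec) -> str:
    """Turn one sentence specification into a capitalized, punctuated sentence."""
    sentence = f"{spec.subject} {spec.verb} {spec.obj}."
    return sentence[0].upper() + sentence[1:]


def generate(goal: str) -> str:
    plan = discourse_planner(goal)
    # The realizer is invoked once for every sentence in the plan.
    return " ".join(surface_realizer(spec) for spec in plan)


print(generate("weather"))
```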

  23. Component Tasks in NLG
      1. Content Determination: what information should be conveyed?
      2. Discourse Planning: order & structure of message set
      3. Sentence Aggregation: grouping messages into sentences
      4. Lexicalization: words & phrases for concepts, relations
      5. Referring Expression Generation: words & phrases for entities
      6. Linguistic Realization: syntax, morphology, orthography

  24. Typical 3-Module/Pipelined Architecture
      goal
      -> Text Planner (1. Content Determination, 2. Discourse Planning)
      -> text plan
      -> Sentence Planner (3. Sentence Aggregation, 4. Lexicalization, 5. Referring Expressions)
      -> sentence plans
      -> Linguistic Realizer (6. Syntax, Morphology, Orthography)
      -> surface text
      Q: How should these (text plans and sentence plans) be represented?

  25. Text Plans
      Common representation: tree
      Leaf nodes = messages
      Internal nodes = message groupings
      Simple text plans: templates OK
      Complex text plans: require full representation language (e.g., TAMERLAN, DIOGENES)
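A text plan tree of this shape can be sketched with two small classes. The class names `Message` and `Group` and the example train content are assumptions made for illustration; they are not taken from TAMERLAN or DIOGENES.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Message:
    """Leaf node: one unit of raw content."""
    content: str


@dataclass
class Group:
    """Internal node: a labelled grouping of messages or sub-groups."""
    label: str
    children: List[Union["Group", Message]] = field(default_factory=list)


def leaves(node: Union[Group, Message]) -> List[str]:
    """Return message contents in left-to-right order, i.e. the order of presentation."""
    if isinstance(node, Message):
        return [node.content]
    ordered: List[str] = []
    for child in node.children:
        ordered.extend(leaves(child))
    return ordered


# Hypothetical plan: a summary first, then a group of details.
plan = Group("train-info", [
    Message("20 trains run daily"),
    Group("details", [
        Message("next train leaves at 10am"),
        Message("next train is the express"),
    ]),
])

print(leaves(plan))
```

The tree keeps the grouping information that a flat list would lose, which later stages can use when deciding which messages to merge into one sentence.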

  26. Sentence Plans
      Simple: templates (select & fill)
      Complex: abstract representation (SPL: Sentence Planning Language)

  27. Example SPL Expression
      (S1/exist
        :object (O1/train
          :cardinality 20
          :relations ((R1/period :value daily)
                      (R2/source :value Aberdeen)
                      (R3/destination :value Glasgow))))
      Realized as: There are 20 trains a day from Aberdeen to Glasgow
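To make the mapping from the abstract expression to the sentence concrete, here is a minimal sketch that stores the same information as a nested Python dictionary and realizes it with a single rule for `exist`. The dictionary layout, `PERIOD_PHRASES`, and `realize_exist` are assumptions for illustration; a real SPL realizer consults a full grammar rather than one hard-coded pattern.

```python
# The SPL expression from the slide, transcribed into a nested dict (assumed layout).
spl = {
    "id": "S1",
    "type": "exist",
    "object": {
        "id": "O1",
        "type": "train",
        "cardinality": 20,
        "relations": [
            {"id": "R1", "type": "period", "value": "daily"},
            {"id": "R2", "type": "source", "value": "Aberdeen"},
            {"id": "R3", "type": "destination", "value": "Glasgow"},
        ],
    },
}

# Hypothetical lexical resource: how a period value is worded.
PERIOD_PHRASES = {"daily": "a day"}


def realize_exist(expr: dict) -> str:
    """Realize an 'exist' expression as a 'There is/are ...' sentence."""
    obj = expr["object"]
    relations = {r["type"]: r["value"] for r in obj["relations"]}
    count = obj["cardinality"]
    # Number agreement: plural noun and verb unless the cardinality is 1.
    noun = obj["type"] if count == 1 else obj["type"] + "s"
    verb = "is" if count == 1 else "are"
    period = PERIOD_PHRASES[relations["period"]]
    return (f"There {verb} {count} {noun} {period} "
            f"from {relations['source']} to {relations['destination']}")


print(realize_exist(spl))
```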

  28. Content Determination
      Messages (raw content)
      User Model (influences content)
      Is Reasoning Required? Example: Find a train from Aberdeen to Leeds (It requires two trains to get there)
      Deep Reasoning Systems: represent the user's goals as well as any immediate query; utilize plan recognition & reasoning

  29. Discourse Planning
      Structure messages into a coherent text
      Example: start with a summary, then give details
      Discourse relations, e.g.:
        elaboration: More specifically, X
        exemplification: For example, X
        contrast / exception: However, X
      Rhetorical Structure Theory (RST)
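The cue phrases listed for each discourse relation lend themselves to a simple lookup. The sketch below is an assumption-laden illustration (the `link` function and the example sentences are hypothetical) and covers only the three relations named on the slide, not RST in general.

```python
# Cue phrases taken from the slide, keyed by discourse relation.
CUE_PHRASES = {
    "elaboration": "More specifically,",
    "exemplification": "For example,",
    "contrast": "However,",
}


def link(nucleus: str, relation: str, satellite: str) -> str:
    """Attach a satellite sentence to a nucleus using the relation's cue phrase.

    The satellite is expected to start in lower case, since the cue phrase
    opens the second sentence.
    """
    return f"{nucleus} {CUE_PHRASES[relation]} {satellite}"


print(link("Trains to Glasgow run all day.", "contrast",
           "the last one leaves at 10pm."))
```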

  30. Sentence Aggregation
      No aggregation (1 sentence / message)
      Relative Clause: ..which leaves at 10am
      Conjunction: ..and the next train is the express
      Combinations: ..and the next train is the express which leaves at 10am
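The aggregation options can be sketched as small string operations over (subject, predicate) messages. The message format and the function names are assumptions for illustration; the example wording follows the fragments on the slide.

```python
from typing import List, Tuple

# A message is a (subject, predicate) pair; this format is an assumption.
Message = Tuple[str, str]

m1: Message = ("the next train", "is the express")
m2: Message = ("the next train", "leaves at 10am")


def no_aggregation(messages: List[Message]) -> List[str]:
    """One sentence per message."""
    sentences = []
    for subject, predicate in messages:
        text = f"{subject} {predicate}."
        sentences.append(text[0].upper() + text[1:])
    return sentences


def aggregate_relative(first: Message, second: Message) -> str:
    """Fold the second message into the first as a relative clause.

    Only valid when both messages are about the same subject.
    """
    if first[0] != second[0]:
        raise ValueError("relative-clause aggregation needs a shared subject")
    return f"{first[0]} {first[1]} which {second[1]}"


def aggregate_conjunction(clause_a: str, clause_b: str) -> str:
    """Join two clauses with 'and'."""
    return f"{clause_a} and {clause_b}"


print(no_aggregation([m1, m2]))
print(aggregate_relative(m1, m2))
print(aggregate_conjunction("the 9am train is slow", aggregate_relative(m1, m2)))
```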

  31. Lexicalization
      Choosing words to realize concepts or relations
      Example: (action/change (measure outside_temperature) (delta (quantity/deg_F -10)))
      Realized as: The temperature dropped 10 degrees
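A minimal sketch of this lexicalization step, assuming the message is held as a flat dictionary: the sign of the delta selects the verb and the measure name is mapped to a noun. The dictionary keys, `MEASURE_NOUNS`, and the choice of "rose" for positive changes are illustrative assumptions; the slide only shows the negative case.

```python
# The change message from the slide, flattened into a dict (assumed layout).
message = {"type": "change", "measure": "outside_temperature", "delta_deg_F": -10}

# Hypothetical lexicon mapping domain measures to nouns.
MEASURE_NOUNS = {"outside_temperature": "temperature"}


def lexicalize_change(msg: dict) -> str:
    """Choose a verb from the sign of the change and word the magnitude."""
    delta = msg["delta_deg_F"]
    if delta == 0:
        raise ValueError("a zero delta is not a change")
    verb = "dropped" if delta < 0 else "rose"
    amount = abs(delta)
    unit = "degree" if amount == 1 else "degrees"
    return f"The {MEASURE_NOUNS[msg['measure']]} {verb} {amount} {unit}"


print(lexicalize_change(message))
```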

  32. Lexical Selection Rules
      (*A-INGEST (AGENT *O-BOB) (PATIENT *O-MILK)) => "drink"
      (*A-INGEST (AGENT *O-BOB) (PATIENT *O-CHOCOLATE)) => "eat"
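These two rules can be read as a small rule table matched against a semantic frame. The sketch below assumes a frame is a (head, roles) pair and that the first matching rule wins; both conventions and the name `select_verb` are illustrative assumptions.

```python
from typing import Dict, List, Tuple

Frame = Tuple[str, Dict[str, str]]

# Each rule: (semantic head, role constraints that must hold, chosen verb).
RULES: List[Tuple[str, Dict[str, str], str]] = [
    ("*A-INGEST", {"PATIENT": "*O-MILK"}, "drink"),
    ("*A-INGEST", {"PATIENT": "*O-CHOCOLATE"}, "eat"),
]


def select_verb(frame: Frame) -> str:
    """Return the verb of the first rule whose head and constraints match the frame."""
    head, roles = frame
    for rule_head, constraints, verb in RULES:
        if head == rule_head and all(roles.get(k) == v for k, v in constraints.items()):
            return verb
    raise LookupError(f"no lexical selection rule for {head}")


print(select_verb(("*A-INGEST", {"AGENT": "*O-BOB", "PATIENT": "*O-MILK"})))
print(select_verb(("*A-INGEST", {"AGENT": "*O-BOB", "PATIENT": "*O-CHOCOLATE"})))
```

The same semantic head yields different verbs depending on the filler of the PATIENT role, which is why selection has to look at the whole frame and not just the head.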

  33. Case Creation
      Additional structure is required to realize the meaning of the semantic representation
      (*A-KICK (AGENT *O-JOHN) (PATIENT *O-BALL))
      "John propelled the ball with his foot"

  34. Case Absorption
      Word chosen to realize a semantic head also implies the meaning conveyed by a semantic role
      (*A-FILE-LEGAL-ACTION (AGENT *O-BOB) (PATIENT *O-SUIT) (RECIPIENT *O-ACME))
      "Bob sued Acme"
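Case absorption can be sketched with a lexicon entry that records which roles the chosen verb already implies. In the example, "sued" absorbs the PATIENT role (the suit), so only the AGENT and RECIPIENT are worded. The lexicon layout, the `NAMES` table, and the `realize` function are hypothetical.

```python
from typing import Dict

# Hypothetical mapping from concept symbols to surface names.
NAMES = {"*O-BOB": "Bob", "*O-ACME": "Acme", "*O-SUIT": "suit"}

# Hypothetical lexicon: the verb, which role becomes its object,
# and which roles its meaning already covers (absorbs).
LEXICON = {
    "*A-FILE-LEGAL-ACTION": {
        "verb": "sued",
        "object_role": "RECIPIENT",
        "absorbed": {"PATIENT"},
    },
}


def realize(head: str, roles: Dict[str, str]) -> str:
    """Realize a frame, leaving absorbed roles unexpressed."""
    entry = LEXICON[head]
    expressed = {"AGENT", entry["object_role"]}
    leftover = set(roles) - expressed - entry["absorbed"]
    if leftover:
        # A role that is neither expressed nor absorbed would be silently lost.
        raise ValueError(f"roles not covered by the lexical entry: {sorted(leftover)}")
    agent = NAMES[roles["AGENT"]]
    obj = NAMES[roles[entry["object_role"]]]
    return f"{agent} {entry['verb']} {obj}"


frame_roles = {"AGENT": "*O-BOB", "PATIENT": "*O-SUIT", "RECIPIENT": "*O-ACME"}
print(realize("*A-FILE-LEGAL-ACTION", frame_roles))
```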

  35. Referring Expression Generation
      Initial introduction: A man in the park looked up
      Pronouns: He saw a bird fly over
      Definite Descriptions: The man covered his head with a newspaper
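The three cases in the example can be sketched with a mention history: an entity is introduced with an indefinite description, referred to by pronoun while it is the most recent mention, and by definite description once another entity has intervened. This rule is a deliberate simplification chosen for illustration, and the entity format and `refer` function are hypothetical.

```python
from typing import Dict, List


def refer(entity: Dict[str, str], history: List[str]) -> str:
    """Pick a referring expression from the discourse history, then record the mention."""
    entity_id = entity["id"]
    if entity_id not in history:
        # First mention: indefinite description (always "a"; "an" is not handled).
        expression = f"a {entity['noun']}"
    elif history[-1] == entity_id:
        # Most recently mentioned entity: a pronoun is unambiguous.
        expression = entity["pronoun"]
    else:
        # Another entity intervened: fall back to a definite description.
        expression = f"the {entity['noun']}"
    history.append(entity_id)
    return expression


man = {"id": "e1", "noun": "man", "pronoun": "he"}
bird = {"id": "e2", "noun": "bird", "pronoun": "it"}

history: List[str] = []
mentions = [refer(man, history), refer(man, history),
            refer(bird, history), refer(man, history)]
print(mentions)
```

Run against the slide's story, the four mentions come out as "a man", "he", "a bird", and "the man", matching the introduction, pronoun, and definite description in the example.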

  36. Fixing Robot Text
      Start [the engine]i and run [the engine]i until [the engine]i reaches normal operating temperature
      Start []i and run [the engine]i until [it]i reaches normal operating temperature
      Second example introduces ellipsis and anaphora

  37. Journalistic Style
      A dissident Spanish priest was charged here today with attempting to murder the Pope. Juan Fernandez Krohn, aged 32, was arrested after a man armed with a bayonet approached the Pope while he was saying prayers at Fatima on Wednesday night. According to the police, Fernandez told the investigating magistrates today, he trained for the past six months for the assault. If found guilty, the Spaniard faces a prison sentence of 15-20 years. (Brown and Yule, 1983)

  38. Reading/References
      Daniel Jurafsky, Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics and Speech Recognition, Pearson Education, Inc., 2000.
      Kamil Wiśniewski, July 12th, 2007, Discourse Analysis. Retrieved from: http://www.tlumaczenia-angielski.info/linguistics/discourse.htm, Retrieved date: Oct 16, 2010.
      M.A. Khan, "Text Based Machine Translation System", PhD Thesis, 1995.
      The Daily News, "Jolie was high on cocaine during TV interview: Former drug dealer", dated: 22 Oct, 2010. http://dailymailnews.com/1010/22/ShowBiz/index.php?id=3
      M.A. Khan, "Machine Translation Beyond Sentence Boundaries", In Proceedings of Workshop on Proofing Tools and Language Technologies, July 1-2, Patras University, Greece. www.mabidkhan.com/.../Scientific%20Khyber,%20Vol%201,%202004.pdf
