Introduction to Natural Language Processing and its Applications


Natural Language Processing (NLP) explores the algorithms and principles behind enabling computers to understand and generate human language. It involves processing large amounts of machine-readable text data and developing systems like text analytics, conversational agents (e.g., Siri, Cortana, Google Now), and machine translation. NLP also includes text analytics for data mining, sentiment analysis, entity identification, and concept extraction from various forms of user-generated media like weblogs and discussion forums.


Uploaded on Oct 01, 2024



Presentation Transcript


  1. Natural Language Processing Lecture 1 8/25/2015 CSCI 5832

  2. Natural Language Processing We're going to study what goes into getting computers to perform useful and interesting tasks involving human language. Also known as human language technology and computational linguistics. 10/1/2024 2 Speech and Language Processing - Jurafsky and Martin

  3. Natural Language Processing More specifically, it's about the algorithms used to process language, the formal basis for those algorithms, and the facts about human language that allow those algorithms to work.

  4. Why Should You Care? Three trends: 1. An enormous amount of information is now available in machine-readable form as natural language text (newspapers, web pages, medical records, financial filings, product reviews, discussion forums, etc.). 2. Conversational agents are becoming an important form of human-computer communication. 3. Much of human-human interaction is now mediated by computers via social media. Collectively, this means that copious data is available to be used in the development of NLP systems.

  5. Applications Let's take a quick look at three prominent application areas: text analytics; conversational agents (Siri, Cortana, Google Now); machine translation.

  6. Text Analytics Data mining of weblogs, microblogs, discussion forums, message boards, user groups, and other forms of user-generated media: product marketing information; political opinion tracking; social network analysis; buzz analysis (what's hot, what topics are people talking about right now).

  7. Text Analytics Typically this involves the extraction of limited kinds of semantic and pragmatic information from texts: entity mentions; concept identification; relational structure; sentiment.

  8. Demo

  9. Conversational Agents Combine: speech recognition/synthesis; question answering, from the web and from structured information sources (Freebase, DBpedia, YAGO, etc.); simple agent-like abilities (create/edit calendar entries, reminders, directions, invoking/interacting with other apps).

  10. Question Answering Traditional information retrieval provides documents/resources that provide users with what they need to satisfy their information needs. Question answering, on the other hand, directly provides an answer to information needs posed as questions.

  11. Watson

  12. Machine Translation The automatic translation of texts between languages is one of the oldest non-numerical applications in Computer Science. In the past 15 years or so, MT has gone from a niche academic curiosity to a robust commercial industry.

  13. Demo

  14. How? All of these applications operate by exploiting regularities underlying all human languages, sometimes in complex ways, sometimes in pretty trivial ways. Language structure; formal models; practical applications.

  15. Course Material We'll be intermingling discussions of: linguistic topics (morphology, syntax, semantics, discourse); formal systems (regular languages, context-free grammars, probabilistic models, neural networks); applications (question answering, machine translation, information extraction, summarization).

  16. Topics: Linguistics Word-level processing; syntactic processing; lexical and compositional semantics; discourse structure.

  17. Topics: Techniques Finite-state methods; context-free methods; probabilistic models; neural network models; supervised machine learning methods.

  18. Categories of Knowledge Phonology, morphology, syntax, semantics, pragmatics, discourse. Each kind of knowledge has associated with it an encapsulated set of processes that make use of it. Interfaces are defined that allow the various levels to communicate. This often leads to a pipeline architecture. [Figure: pipeline of Morphological Processing, Syntactic Analysis, Semantic Interpretation, Context]
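
The pipeline idea on slide 18 can be sketched roughly as follows. This is only an illustration: the stage bodies, the `TOY_LEXICON` table, and the `run_pipeline` helper are hypothetical stand-ins, not the course's implementation. The point is that each level is an encapsulated function and the shared state dictionary acts as the interface between levels.

```python
from typing import Any, Callable, Dict, List

State = Dict[str, Any]
Stage = Callable[[State], State]

# Hypothetical toy lexicon; a real system would use a far richer resource.
TOY_LEXICON = {"i": "PRON", "made": "VERB", "her": "PRON", "duck": "NOUN"}


def morphological_processing(state: State) -> State:
    # Stand-in for morphology: lowercase the text and split on whitespace.
    return {**state, "tokens": state["text"].lower().split()}


def syntactic_analysis(state: State) -> State:
    # Stand-in for syntax: attach one category per token from the toy lexicon.
    tags = [(tok, TOY_LEXICON.get(tok, "UNK")) for tok in state["tokens"]]
    return {**state, "tags": tags}


def semantic_interpretation(state: State) -> State:
    # Stand-in for semantics: treat the first verb as the predicate.
    verbs = [tok for tok, tag in state["tags"] if tag == "VERB"]
    return {**state, "predicate": verbs[0] if verbs else None}


PIPELINE: List[Stage] = [
    morphological_processing,
    syntactic_analysis,
    semantic_interpretation,
]


def run_pipeline(text: str, stages: List[Stage] = PIPELINE) -> State:
    # Each stage only sees what earlier stages chose to pass along.
    state: State = {"text": text}
    for stage in stages:
        state = stage(state)
    return state
```

Note that each stage here commits to a single answer, which is exactly what becomes a problem once ambiguity enters the picture.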

  19. Ambiguity Ambiguity is a fundamental problem in computational linguistics. Hence, resolving, or managing, ambiguity is a recurrent theme.

  20. Ambiguity Find at least 5 meanings of this sentence: I made her duck

  21. Ambiguity Find at least 5 meanings of this sentence: I made her duck. I cooked waterfowl for her benefit (to eat); I cooked waterfowl belonging to her; I created the (ceramic?) duck she owns; I caused her to quickly lower her upper body; I waved my magic wand and turned her into undifferentiated waterfowl.

  22. Ambiguity is Pervasive "I caused her to quickly lower her head or body": lexical category: "duck" can be a noun or verb. "I cooked waterfowl belonging to her": lexical category: "her" can be a possessive ("of her") or dative ("for her") pronoun. "I made the (ceramic) duck statue she owns": lexical semantics: "make" can mean "create" or "cook", and about 100 other things as well.
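
To see how quickly lexical-category ambiguity multiplies, here is a small sketch that enumerates every combination of categories for the example sentence. The `AMBIGUOUS_LEXICON` table and its tag names (`POSS`, `DAT`, and so on) are hypothetical labels chosen for this illustration; they only encode the two category ambiguities named on slide 22.

```python
from itertools import product
from typing import List, Tuple

# Hypothetical lexicon: each word maps to every category it may take.
AMBIGUOUS_LEXICON = {
    "I": ["PRON"],
    "made": ["VERB"],
    "her": ["POSS", "DAT"],    # possessive ("of her") or dative ("for her")
    "duck": ["NOUN", "VERB"],  # the bird, or the act of lowering one's body
}


def candidate_tag_sequences(words: List[str]) -> List[List[Tuple[str, str]]]:
    # Take the Cartesian product of the per-word category options.
    options = [AMBIGUOUS_LEXICON[word] for word in words]
    return [list(zip(words, tags)) for tags in product(*options)]


if __name__ == "__main__":
    for sequence in candidate_tag_sequences(["I", "made", "her", "duck"]):
        print(sequence)
```

Two binary choices already give four candidate analyses; the number of combinations grows multiplicatively with each additional ambiguous word, before the different senses of "make" are even considered.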

  23. Ambiguity is Pervasive Grammar: "make" can be: transitive (verb has a noun direct object): I cooked [waterfowl belonging to her]; ditransitive (verb has 2 noun arguments): I made [her] (into) [undifferentiated waterfowl]; action-transitive (verb has a direct object and another verb): I caused [her] [to move her body].

  24. Ambiguity is Pervasive Not to mention phonetics! I mate or duck / I'm eight or duck / Eye maid; her duck / Aye mate, her duck / I maid her duck / I'm aid her duck / I mate her duck / I'm ate her duck / I'm ate or duck / I mate or duck

  25. Problem Remember our pipeline... [Figure: pipeline of Morphological Processing, Syntactic Analysis, Semantic Interpretation, Context]

  26. It really looks like this [Figure: the pipeline redrawn with many overlapping copies of Morphological Processing, Syntactic Analysis, and Semantic Interpretation]

  27. Dealing with Ambiguity Various possible approaches: 1. Tightly coupled interaction among processing levels; knowledge from other levels can help decide among choices at ambiguous levels. 2. Pipeline processing that ignores ambiguity as it occurs and hopes that other levels can eliminate incorrect structures. 3. Probabilistic approaches based on making the most likely choices, or passing along n-best choices.
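
The third approach on slide 27 can be sketched in a few lines. The scores in `HYPOTHETICAL_SCORES` are made up purely for illustration (a real system would estimate them from data), and the function names are assumptions of this sketch rather than anything from the course.

```python
import heapq
from typing import Dict, List, Tuple

# Made-up scores for the readings of "I made her duck"; illustration only.
HYPOTHETICAL_SCORES: Dict[str, float] = {
    "cooked waterfowl for her": 0.45,
    "cooked waterfowl belonging to her": 0.25,
    "caused her to lower her body": 0.20,
    "created the duck she owns": 0.08,
    "turned her into waterfowl": 0.02,
}


def most_likely(scores: Dict[str, float]) -> str:
    # Commit to the single highest-scoring reading.
    return max(scores, key=lambda reading: scores[reading])


def n_best(scores: Dict[str, float], n: int) -> List[Tuple[str, float]]:
    # Keep the n highest-scoring readings so a later level can still choose.
    return heapq.nlargest(n, scores.items(), key=lambda item: item[1])


if __name__ == "__main__":
    print(most_likely(HYPOTHETICAL_SCORES))
    print(n_best(HYPOTHETICAL_SCORES, 2))
```

Picking only the top choice is cheap but irreversible; passing along an n-best list costs more downstream work yet leaves room for later levels to overrule an early mistake.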

  28. Models and Algorithms By models we mean the formalisms that are used to capture the various kinds of linguistic knowledge we need. Algorithms are then used to manipulate the knowledge representations needed to tackle the task at hand.

  29. Models Finite state machines; rule-based and logic-based approaches; probabilistic models; neural network models.

  30. Kinds of Algorithms In particular: State-space search, to manage the problem of making choices during processing when we lack the information needed to make the right choice. Dynamic programming, to avoid having to redo work during the course of a state-space search (CKY, Earley, Minimum Edit Distance, Viterbi, Baum-Welch). Classifiers: machine-learning-based classifiers that are trained to make decisions based on features extracted from the local context.
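
As one concrete instance of the dynamic programming idea on slide 30, here is a minimal sketch of minimum edit distance. It assumes unit insertion and deletion costs and a configurable substitution cost (the `sub_cost` parameter is an assumption of this sketch); the table stores the answer to each subproblem once so it is never recomputed.

```python
def min_edit_distance(source: str, target: str, sub_cost: int = 1) -> int:
    n, m = len(source), len(target)
    # dist[i][j] = cheapest way to turn source[:i] into target[:j].
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dist[i][0] = i  # delete all i characters
    for j in range(1, m + 1):
        dist[0][j] = j  # insert all j characters
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            same = source[i - 1] == target[j - 1]
            dist[i][j] = min(
                dist[i - 1][j] + 1,  # deletion
                dist[i][j - 1] + 1,  # insertion
                dist[i - 1][j - 1] + (0 if same else sub_cost),  # match or substitution
            )
    return dist[n][m]


if __name__ == "__main__":
    print(min_edit_distance("kitten", "sitting"))
```

A naive recursive search over edit sequences would revisit the same prefix pairs exponentially often; filling the table takes time proportional to the product of the two string lengths.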

  31. Administrative Stuff Web page; Moodle; reasonable preparation; requirements.

  32. Moodle This will be the main website for the course. Go to moodle.cs.colorado.edu and self-enroll into the course using the key

  33. Be Boulder Anywhere For remote students: Don't fall behind on the lectures. We're covering a lot of material in a short period of time. You have a standing 1-week delay on assignment deadlines and on the quiz dates. All students can access the class lectures via beboulderanywhere.colorado.edu

  34. Preparation Ability to program; basic algorithm and data structure analysis; some exposure to logic; exposure to basic concepts in probability; familiarity with linguistics, psychology, and philosophy; ability to write well in English.

  35. Requirements Readings: Speech and Language Processing by Jurafsky and Martin, 2nd ed., Prentice-Hall 2009; various draft chapters that are in prep for a new edition; a few conference or journal papers. Around 5 assignments, mainly programming and written/problem sets. 2 quizzes. Final comprehensive (sort of) exam on Thursday, December 17 from 1:30 to 4:00. Don't make plans to leave before the final.

  36. Programming Most of the programming will be done in Python: free and works on Windows, Macs, and Linux; easy to install; easy to learn.

  37. Grading Assignments 30%; Midterms 30%; Final 30%; Participation 10%.
