The Future Role of Statistics in AI: Insights from Raj Reddy's Speech at the Indian Statistical Institute


Raj Reddy of Carnegie Mellon University, speaking at the Indian Statistical Institute in Kolkata, delves into the evolving role of statistics in the field of AI, highlighting the shift toward statistical learning and the need for new methods that enable low-cost incremental learning and reinforcement learning. The talk covers the history of AI research, principles of AI, and previous uses of statistics in AI applications, providing valuable insights for the future of AI development.





Presentation Transcript


  1. Future Role of Statistics in AI
  Raj Reddy, Carnegie Mellon University, Pittsburgh, PA 15213
  Speech at ISI Kolkata, June 29, 2021
  Thank you, Dr. Sanghamitra. Distinguished Professors Dr. Dipak Dey and Dr. Bibek Debroy, honored guests, faculty, and students: I am glad to have the opportunity to deliver this lecture on the occasion of the 128th birthday of Dr. Mahalanobis. I was asked to speak on his 125th birthday but could not because of scheduling difficulties. Giving this talk on his 128th birthday is in fact more appropriate: for computer scientists, all powers of two are close to our heart.

  2. Introduction
  Today my talk is on the topic of the future role of statistics in AI. Early AI research concentrated on Knowledge-Based Systems, with the assumption that knowledge is God-given truth and does not change. Even Einstein famously said that "God does not play dice." Recent advances in AI have shown that all knowledge arises out of learning, and all learning is inherently statistical. However, much of the recent work in Machine Learning has focused on very large data sets (Big Data) and the use of unlimited computation, memory, and bandwidth. In this talk, we will identify different strategies used by humans to learn from little or no data, propose that much needs to be done in statistical theory to account for such phenomena, and derive new Machine Learning methods that facilitate low-cost incremental learning, reinforcement learning, and learning by discovery.

  3. What is AI?
  AI is an attempt to automate tasks that are usually thought to be uniquely human, requiring intelligence, intuition, creativity, innovation, emotion, and empathy. Usually this is done by human coding of knowledge using heuristics and rules.
  AI uses:
  - Non-Sequential Algorithms
  - Use of Context, including Probabilistic Context
  - Statistical Models: HMMs and DNNs, Bayesian Networks

  4. AI Principles: Use of Knowledge to Solve Problems
  An Intelligent System must:
  - Learn from Experience
  - Use Vast Amounts of Knowledge
  - Tolerate Error and Ambiguity
  - Respond in Real Time
  - Communicate with Humans using Natural Language
  - Use Symbols and Abstractions
  Search Compensates for Lack of Knowledge (e.g., Puzzles)
  Knowledge Compensates for Lack of Search (e.g., F = ma; E = mc^2)
  Traditional Sources of Knowledge:
  - Formal Knowledge: learnt in Schools and Universities, from Books and Manuals
  - Informal Knowledge: Heuristics provided by People

  5. Previous Use of Statistics in AI
  - Neural Network Learning in the 1950s: Perceptron, Pandemonium; Minsky's "Steps Toward Artificial Intelligence" surveys the efforts of the 50s and 60s
  - Hidden Markov Models in Speech and Language: CMU (Jim Baker's Thesis, 1974); IBM Research (Fred Jelinek and Team)
  - Back Propagation in Neural Networks: Geoff Hinton at CMU, 1986
  - TDNNs in Speech: Alex Waibel, 1993
  - Convolutional NNs for Vision: Yann LeCun, 1995
  - Deep Neural Networks: Hinton et al., Toronto, 2009

  6. Techniques and Systems of AI in the 20th Century
  Human Encoding of Knowledge:
  - Expert Systems
  - Knowledge-Based Systems
  - Rule-Based Systems
  Systems That Do Tasks That Require Intelligence:
  - Play Chess
  - Prove Theorems
  - Discover Molecular Structure
  Systems That Do Tasks That Humans Do Effortlessly:
  - Speak and Hear: Speech Understanding and Dialog
  - See: Computer Vision and Image Understanding
  - Use Language: Ambiguous and Non-grammatical Language
  - Drive a Car

  7. Major Advances in AI in the 20th Century
  Enabled by Brute Force, Heuristics, Human Coding of Rules and Knowledge, and Simple Machine Learning (Pattern Recognition):
  - World Champion Chess Machine: IBM Deep Blue
  - Accident-Avoiding Car: CMU's "No Hands Across America"
  - Robotics: Manufacturing Automation, Disaster Rescue Robots
  - Speech Recognition Systems: Dictation Machines
  - Expert Systems: Rule-Based Systems, Knowledge-Based Systems

  8. AI in the 21st Century (So Far)
  Discover and Use Data-Driven Knowledge Sources
  A Paradigm Shift in Science:
  - First 3 Paradigms: Experiment, Theory, Simulation (Rutherford, Bohr, Oppenheimer)
  - 4th Paradigm: Data-Driven Science
  Create Next-Generation, Data-Driven AI Systems to Solve Previously Unsolved Problems
  AI 2.0: Using Previously Unavailable Sources of Data:
  - Knowledge from Big Data: Data-Driven Learning of Models and Algorithms
  - Knowledge from Crowd Intelligence: the Global Brain, from Individual Intelligence to Collective Intelligence
  - Knowledge from Unmanned Autonomous Vehicles: Intelligence from Collaborating Teams of Robots
  - Automatic Discovery of New Knowledge: Machine Learning using Big Data, Deep Learning

  9. Advances in AI in the 21st Century (So Far)
  Using Learning Enabled by Big Data and Deep Neural Networks:
  - Language Translation: Google Translate (Any Language to Any Language)
  - Speech-to-Speech Dialog: Siri, Cortana, Alexa
  - Autonomous Vehicles: CMU, Stanford, Google, Tesla
  - Deep Question Answering: IBM's Watson
  - Robot Soccer
  - World Champion Poker: CMU's Libratus (No-Limit Texas Hold'em)

  10. AI and Deep Neural Networks
  Strengths:
  - Breakthroughs in many previously hard task domains: Speech, Vision, Language, Navigation
  Limitations:
  - Deep learning requires a large amount of annotated data.
  - Deep Nets perform well on benchmark datasets but can fail badly on real-world images outside the dataset.
  - No dataset can be fully representative of the complexity of the real world.
  - Deep Nets are overly sensitive to changes in the image that would not fool a human observer: imperceptible perturbations, added noise, or image occlusion can lead to incorrect labels.
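  The adversarial-sensitivity point above can be made concrete with a toy sketch (my illustration, not from the talk). For a linear score w . x, perturbing each feature by a tiny eps in the direction of sign(w) raises the score as much as possible per unit of change, which is the intuition behind fast-gradient-sign attacks on deep nets. The weights and input here are hand-made for illustration:

  ```python
  # Toy adversarial perturbation on a hypothetical linear classifier.
  # score(x) = w . x; the input is classified by the sign of the score.

  def score(w, x):
      return sum(wi * xi for wi, xi in zip(w, x))

  def sign(v):
      return 1.0 if v > 0 else (-1.0 if v < 0 else 0.0)

  def adversarial(w, x, eps):
      # Move each feature by at most eps in the direction that raises the score.
      return [xi + eps * sign(wi) for wi, xi in zip(w, x)]

  w = [0.5, -0.8, 0.3, 0.9]    # hypothetical learned weights
  x = [0.1, 0.2, -0.1, 0.05]   # hypothetical "clean" input

  clean = score(w, x)                            # -0.095: classified negative
  adv = score(w, adversarial(w, x, eps=0.05))    # +0.030: label flips
  print(clean, adv)
  ```

  A per-feature change of 0.05 (imperceptible if these were pixel intensities) is enough to flip the predicted label, even though the input is essentially unchanged.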

  11. Statistical Techniques for Different Types of Human Learning
  - Rote Learning: the Role of Chunks in the Psychology Literature
  - Learning by Teaching
  - Learning by Debugging
  - Learning from Example
  - Learning by Doing
  - Learning by Discovery: Big Data, New Data, Data from Exploration and Search (e.g., the Soma Cube Puzzle)

  12. Statistical Theory for Resource-Limited Computational Statistics
  Tesla Motors' Supercomputer:
  - 720 Nodes of 8 x A100 80GB GPUs (5760 GPUs total)
  - 1.8 Exa-FLOPS (10^18 FLOPS): 720 Nodes x 8 GPUs/Node x 312 TFLOPS/GPU, i.e., 2.5 PFLOPS/Node
  - 10 PetaBytes of NVMe storage @ 1.6 TBytes/sec
  - 640 Tbps of Total Switching Capacity
  Vision-Only Autonomous Driving by Tesla:
  - Ditch radar and lidar sensors on self-driving cars in favor of high-quality optical cameras
  Tesla's Driving Database:
  - 1 million videos of around 10 seconds each, with 6 billion objects labeled with depth, velocity, and acceleration
  - All of this takes up a whopping 1.5 petabytes of storage
  We need algorithms for 1 GPU.
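  The figures on the slide are mutually consistent, as a quick arithmetic check shows (312 TFLOPS is the per-GPU peak quoted on the slide):

  ```python
  # Sanity check of the slide's supercomputer arithmetic.
  TFLOPS = 10**12

  nodes = 720
  gpus_per_node = 8
  flops_per_gpu = 312 * TFLOPS   # per-GPU peak as quoted

  gpus = nodes * gpus_per_node            # total GPU count
  node_flops = gpus_per_node * flops_per_gpu
  cluster_flops = nodes * node_flops

  print(gpus)                        # 5760 GPUs
  print(node_flops / 10**15)         # ~2.5 PFLOPS per node
  print(cluster_flops / 10**18)      # ~1.8 Exa-FLOPS for the cluster
  ```

  This is the scale of resource the talk argues most researchers do not have, motivating the closing point: we need algorithms that run on a single GPU.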

  13. Missing Statistical Theory
  - Incremental Learning
  - One-Shot Deep Learning
  - Few-Shot Deep Learning
  - Algorithms for Finite-Sized Datasets: how can we efficiently test these algorithms to ensure that they work on enormous datasets if we can only test them on a finite subset?
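  As a minimal sketch of what "one-shot" means (my illustration, not part of the talk): with a single labeled example per class, a new input can be assigned the label of the nearest stored example. Real one-shot systems learn the embedding in which distances are computed; here plain Euclidean distance on hand-made feature vectors stands in for it:

  ```python
  # One-shot classification by nearest neighbor over single-example prototypes.
  import math

  def euclidean(a, b):
      return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

  def one_shot_classify(prototypes, x):
      # prototypes: {label: the single labeled example ("shot") for that class}
      return min(prototypes, key=lambda label: euclidean(prototypes[label], x))

  # One hypothetical feature vector per class.
  prototypes = {
      "cat": [0.9, 0.1, 0.2],
      "dog": [0.1, 0.8, 0.3],
  }

  print(one_shot_classify(prototypes, [0.8, 0.2, 0.25]))  # prints "cat"
  ```

  The statistical theory the slide calls "missing" would characterize when and why such single-example generalization can be expected to work.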

  14. Epistemology: The Statistical Nature of Knowledge
  - Initially, everything is informal and approximate knowledge.
  - As we formulate mathematical formulas and theories, informal knowledge becomes formal, precise, predictable, and repeatable.
  - The invention of zero and the algorithms for addition and multiplication took five hundred years.
  - There is no such thing as absolute truth.
  Epistemology: https://en.wikipedia.org/wiki/Epistemology

  15. Future Role of ISI in India
  - Become a Major Stakeholder in the National Language Technology Mission of MEITY?
  - ISI should become the Source of Public Data on Indian Languages: 100K Hours of Speech for each Language; 100M Words of Text for each Language
  - Host a Supercomputer for AI Apps: a 1.8 Exaflop System, ~$20M (Equivalent to the Tesla System), added incrementally over 10 Years?
  - Become the World Leader in Computational Statistics: 90/10 Approximate Algorithms, Incremental Algorithms, One-Shot and Few-Shot Learning
