Future Role of Statistics in AI
Raj Reddy
Carnegie Mellon University
Pittsburgh, PA 15213
Speech at ISI Kolkata, June 29, 2021
Thank you, Dr Sanghamitra. Distinguished Professors Dr Dipak Dey and Dr
Bibek Debroy, honored guests, faculty, and students.
I am glad to have the opportunity to deliver this lecture on the occasion
of the 128th birthday of Dr Mahalanobis. I was asked to speak on his 125th
birthday but could not because of scheduling difficulties. Giving this talk
on his 128th birthday is all the more appropriate: for computer scientists,
all powers of two are close to our heart.
Introduction
Today my talk is on the topic of “The Future Role of Statistics in AI.”
Early AI Research concentrated on Knowledge-Based Systems, with
the assumption “Knowledge is God-Given Truth” and does not
change.
Even Einstein famously said that “God does not play Dice”.
Recent advances in AI have shown that all knowledge arises out of
Learning and all learning is inherently Statistical.
However, much of the recent work in Machine Learning has been
focused on very large data sets (Big Data) and the use of unlimited
computation, memory, and bandwidth.
In this talk, we will identify different Strategies used by Humans to
learn from “little data or no data” and
Propose that much needs to be done in Statistical Theory to account
for such phenomena and
Derive new Machine Learning methods that facilitate low-cost
incremental learning, reinforcement learning, and learning by
discovery.
What is AI?
AI is an attempt to automate tasks that are usually thought to be
uniquely Human
Requiring Intelligence, Intuition, Creativity, Innovation, Emotion, Empathy
Usually, By Human coding of Knowledge
Using Heuristics and Rules
AI uses Non-Sequential Algorithms
Use of Context
Probabilistic Context
Statistical Models – HMMs and DNNs
Bayesian Networks
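Statistical models such as HMMs assign probabilities to sequences of observations. As a minimal sketch of how an HMM scores a sequence, the forward algorithm below computes the total probability of an observation sequence; the two-state model and all of its probabilities are invented for illustration.

```python
# Forward algorithm for a discrete hidden Markov model (toy example).
# States: 0 = Rainy, 1 = Sunny; observations: 0 = walk, 1 = shop.
# All probabilities below are invented for illustration.

def forward(obs, init, trans, emit):
    """Return P(observation sequence) under the HMM."""
    n = len(init)
    # alpha[s] = P(obs[0..t], hidden state at t == s)
    alpha = [init[s] * emit[s][obs[0]] for s in range(n)]
    for o in obs[1:]:
        alpha = [
            sum(alpha[p] * trans[p][s] for p in range(n)) * emit[s][o]
            for s in range(n)
        ]
    return sum(alpha)

init = [0.6, 0.4]                    # initial state distribution
trans = [[0.7, 0.3], [0.4, 0.6]]     # state transition probabilities
emit = [[0.1, 0.9], [0.6, 0.4]]      # emission probabilities

p = forward([0, 1, 0], init, trans, emit)   # probability of "walk, shop, walk"
```

The same dynamic-programming structure, with learned parameters, underlies the speech-recognition HMMs mentioned on the next slide.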
AI Principles: Use of Knowledge to Solve Problems
An Intelligent System must
Learn from Experience
Use Vast Amount of Knowledge
Tolerate Error and Ambiguity
Respond in Real Time
Communicate with Humans using Natural Language
Use Symbols and Abstractions
Search Compensates for Lack of Knowledge
Puzzles
Knowledge Compensates for Lack of Search
F = ma;   E = mc²
Traditional Sources of Knowledge
Formal Knowledge: Learnt in Schools and Universities
Books, Manuals,
Informal Knowledge: Heuristics provided by People
Previous Use of Statistics in AI
Neural Network Learning in the 1950s
Perceptron
Pandemonium
Minsky’s “Steps Toward Artificial Intelligence” has a survey of efforts in the 50s and 60s
Hidden Markov Models in Speech and Language
CMU: Jim Baker’s Thesis 1974
IBM Research: Fred Jelinek and Team
Back Propagation: Neural Networks
Geoff Hinton at CMU 1986
TDNN in Speech by Alex Waibel 1993
Convolutional NNs for Vision by Yann LeCun 1995
Deep Neural Networks Hinton et al Toronto 2009
Techniques and Systems of AI in the 20th Century
Human Encoding of Knowledge
Expert Systems
Knowledge Based Systems
Rule Based Systems
Systems That Do Tasks that Require Intelligence
Play Chess
Prove Theorems
Discover Molecular Structure
Systems That Do Tasks That Humans Do Effortlessly
Speak and Hear: Speech Understanding and Dialog
See: Computer Vision and Image Understanding
Use Language: Ambiguous and Non-grammatical Language
Drive a Car
Major Advances in AI in the 20th Century
Enabled by Brute-force, Heuristics, Human Coding of Rules and
Knowledge, and Simple Machine Learning (Pattern Recognition)
World Champion Chess Machine
IBM Deep Blue
Accident Avoiding Car
CMU: No Hands Across America
Robotics
Manufacturing Automation
Disaster Rescue Robots
Speech Recognition Systems
Dictation Machine
Expert Systems
Rule Based Systems
Knowledge Based Systems
AI in the 21st Century (so far)
Discover and Use Data Driven Knowledge Sources
Paradigm Shift in Science
First 3 Paradigms: Experiment, Theory, Simulation
Rutherford, Bohr, Oppenheimer
4th Paradigm: Data Driven Science
Create Next Generation AI systems
Data Driven AI systems
To Solve Previously Unsolved Problems
AI 2.0: Using Previously Unavailable Sources of Data
Knowledge from Big Data:
Data Driven Learning of Models and Algorithms
Knowledge from Crowd Intelligence:
Global Brain: from Individual Intelligence to Collective Intelligence
Knowledge from Unmanned Autonomous Vehicles:
Intelligence from Collaborating Teams of Robots
Automatic Discovery of New Knowledge
Machine Learning using Big Data
Deep Learning
Advances in AI in the 21st Century (so far)
Using Learning Enabled by Big Data and Deep Neural Networks
Language Translation
Google Translate: Any Language to Any Language
Speech to Speech Dialog
Siri, Cortana, Alexa
Autonomous Vehicles
CMU, Stanford, Google, Tesla
Deep Question Answering
IBM’s Watson
Robot Soccer
World Champion Poker
CMU Libratus
No Limit Texas Hold’em Poker
AI and Deep Neural Networks
Strengths
Breakthroughs in many previously hard task domains
Speech
Vision
Language
Navigation
Limitations
Deep learning requires a large amount of annotated data.
Deep Nets perform well on benchmarked datasets but can fail
badly on real world images outside the dataset.
No dataset can be fully representative of the complexity of the real
world.
Deep Nets are overly sensitive to changes in the image that would not
fool a human observer:
imperceptible changes to the image, added noise, and/or image
occlusion lead to incorrect labels
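The sensitivity above can be sketched with a toy linear classifier: nudging every input by a small ε in the direction sign(w) shifts the score by ε·‖w‖₁, which can be enough to flip the predicted label. The weights and the tiny "image" below are invented for illustration; real deep networks show the same effect under perturbations far too small for a human to notice.

```python
# Toy illustration of adversarial sensitivity for a linear classifier
# score = w . x + b. Perturbing each "pixel" by eps along sign(w)
# shifts the score by eps * sum(|w_i|), flipping the label.
# All numbers are invented for illustration.

def classify(w, b, x):
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if score > 0 else 0

def perturb(w, x, eps):
    # move each pixel by eps in the direction that increases the score
    return [xi + eps * (1 if wi > 0 else -1) for wi, xi in zip(w, x)]

w = [0.5, -0.25, 1.0, -1.0]      # classifier weights
b = -0.1
x = [0.1, 0.2, 0.1, 0.2]         # a 4-pixel "image", classified as 0
eps = 0.1                        # small per-pixel change

y_clean = classify(w, b, x)              # label before perturbation
y_adv = classify(w, b, perturb(w, x, eps))   # label after perturbation
```

Here the score moves by 0.1 × 2.75 = 0.275, crossing the decision boundary even though no pixel changed by more than 0.1.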
Statistical Techniques for
Different Types of Human Learning
Rote Learning
Role of Chunks in Psychology Literature
Learning by Teaching
Learning by Debugging
Learning from Example
Learning by Doing
Learning by Discovery
Big Data
New Data
Data from Exploration and Search
Soma Cube Puzzle
Statistical Theory for
Resource Limited Computational Statistics
Tesla Motors Super-Computer
720 Nodes
8 x A100 80GB GPUs per Node (5,760 GPUs total)
1.8 ExaFLOPS (10^18 FLOPS)
720 Nodes * 8 GPUs/Node * 312 TFLOPS/GPU
2.5 PFLOPS/Node
10 PetaByte NVMe @ 1.6 TByte/sec
640 Tbps of Total Switching Capacity
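The headline numbers on this slide are mutually consistent, which a few lines of arithmetic confirm (all inputs are taken from the slide itself):

```python
# Cross-checking the Tesla training-cluster figures quoted on the slide.
nodes = 720
gpus_per_node = 8
tflops_per_gpu = 312                 # per-GPU peak as quoted on the slide

total_gpus = nodes * gpus_per_node                        # 5,760 GPUs
pflops_per_node = gpus_per_node * tflops_per_gpu / 1_000  # ~2.5 PFLOPS/Node
total_eflops = total_gpus * tflops_per_gpu / 1_000_000    # ~1.8 ExaFLOPS
```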
Vision-only Autonomous Driving by Tesla
Ditch radar and lidar sensors on self-driving cars in favor of high-quality
optical cameras
Tesla’s Driving Database
1 million videos of around 10 seconds each and
Labeled 6 billion objects with depth, velocity, and acceleration.
All of this takes up a whopping 1.5 petabytes of storage
We need algorithms for 1 GPU
Missing Statistical Theory
Incremental Learning
One Shot Deep Learning
Few Shot Deep Learning
Algorithms for finite sized datasets
How can we efficiently test these algorithms to
ensure that they work on these enormous datasets if
we can only test them on a finite subset?
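One simple baseline for the one-shot and few-shot learning listed above is nearest-class-mean classification: represent each class by the mean of its few labeled examples and assign a query to the closest mean. The sketch below uses invented 2-D "features"; in practice the features would come from a pretrained embedding network.

```python
# Few-shot classification by nearest class mean.
# The 2-D feature vectors below are invented for illustration.

def class_means(support):
    """support: {label: [feature vectors]} -> {label: mean vector}"""
    return {
        label: [sum(col) / len(vecs) for col in zip(*vecs)]
        for label, vecs in support.items()
    }

def predict(means, query):
    def dist2(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return min(means, key=lambda label: dist2(means[label], query))

# Two classes, three labeled examples each ("3-shot" learning).
support = {
    "cat": [[0.9, 0.1], [1.0, 0.2], [0.8, 0.0]],
    "dog": [[0.1, 0.9], [0.2, 1.0], [0.0, 0.8]],
}
means = class_means(support)
label = predict(means, [0.85, 0.15])   # query near the "cat" examples
```

The statistical question this slide raises is exactly when and why such tiny-sample estimates of a class mean generalize.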
Epistemology: Statistical Nature of Knowledge
Initially Everything is Informal and Approximate
Knowledge
As We Formulate Mathematical Formulas and Theories,
Informal Knowledge Becomes Formal, Precise, Predictable
and Repeatable
Invention of Zero and Algorithms for Addition and Multiplication
took Five Hundred Years
No Such Thing as Absolute Truth
Epistemology
https://en.wikipedia.org/wiki/Epistemology
Future Role of ISI in India
Major Stakeholder in the National Language Technology Mission of
MEITY?
ISI should become the Source of Public Data on Indian
Languages
100K Hours Speech for each Language
100M Words of Text for each Language
Host Supercomputer for AI Apps
1.8 Exaflop System $20M (Equivalent to Tesla System)
Add incrementally over 10 Years?
Become the World Leader in Computational Statistics
90/10 Approximate Algorithms
Incremental Algorithms
One Shot and Few Shot Learning
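A classic instance of the incremental algorithms listed above is Welford's online update for the mean and variance, which processes one observation at a time in constant memory and never needs the full dataset:

```python
# Welford's online algorithm: update mean and variance one observation
# at a time, in O(1) memory, without storing the data.

class RunningStats:
    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0   # running sum of squared deviations from the mean

    def update(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def variance(self):
        # unbiased sample variance
        return self.m2 / (self.n - 1) if self.n > 1 else 0.0

stats = RunningStats()
for x in [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]:
    stats.update(x)
```

The same one-pass pattern extends to covariances and regression, which is why incremental formulations matter for the resource-limited setting this talk argues for.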