Data Flow in Machine Learning Systems

 
Data Flow in an ML system
 
d.ML, Winter 2018-19
[Diagram: the pipeline stages Data, Data Cleaning, Model Building, Model Evaluating, Model Output, and Model Interaction, grouped under Data design, Model design, and Output design]
 
Data

All machine learning models need data.
Where does your data come from?
What is the type() of each of the variables (columns)?

Animal   Legs   Furry   Sound
Cat      4      Yes     Meow
Dog      4      Yes     Woof
Pig      4      No      Oink
Lizard   4      No      N/A
...      ...    ...     ...

Key vocab:
Training data - the data you use to train your ML model
Data type - the type/format of your data (string/integer)
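As a rough illustration of the type() question above, the sketch below checks which Python types appear in each column of a small animal table like the one used in this deck. The row layout, the use of None for N/A, and the helper name column_types are assumptions made for the example, not part of the original workbook.

```python
# Hypothetical rows mirroring the small animal table used in this deck.
rows = [
    {"Animal": "Cat", "Legs": 4, "Furry": "Yes", "Sound": "Meow"},
    {"Animal": "Dog", "Legs": 4, "Furry": "Yes", "Sound": "Woof"},
    {"Animal": "Pig", "Legs": 4, "Furry": "No", "Sound": "Oink"},
    {"Animal": "Lizard", "Legs": 4, "Furry": "No", "Sound": None},  # N/A
]


def column_types(table):
    """Map each column name to the set of Python type names seen in it."""
    types = {}
    for row in table:
        for column, value in row.items():
            types.setdefault(column, set()).add(type(value).__name__)
    return types


types_by_column = column_types(rows)
for column, seen in types_by_column.items():
    print(column, sorted(seen))
```

A column that reports more than one type (here Sound, because of the missing value) is usually a hint that some cleaning is needed before modelling.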
 
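Data Cleaning

Format the data in a way that the computer can read it.
You might choose to exclude missing values.
Explore your data - look for trends that might inform you.
Remember - how was your data collected?
How is it going to be used?

Before cleaning:

Animal   Legs   Furry   Sound
Cat      4      Yes     Meow
Dog      4      Yes     Woof
Pig      4      No      Oink
Lizard   4      No      N/A
...      ...    ...     ...

After cleaning:

Animal   Legs   Furry
Cat      4      1
Dog      4      1
Pig      4      0
Lizard   4      0
...      ...    ...

Key vocab:
Normalizing
Remove NA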
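The cleaning steps named above can be sketched in a few lines of plain Python. This is only one possible reading of the slide: the function names are hypothetical, None stands in for N/A, "normalizing" is taken to mean min-max scaling to the range 0 to 1, and the Sound column is dropped because that is what the cleaned table in the deck appears to do.

```python
# Hypothetical raw table; None stands in for the N/A entry.
RAW = [
    {"Animal": "Cat", "Legs": 4, "Furry": "Yes", "Sound": "Meow"},
    {"Animal": "Dog", "Legs": 4, "Furry": "Yes", "Sound": "Woof"},
    {"Animal": "Pig", "Legs": 4, "Furry": "No", "Sound": "Oink"},
    {"Animal": "Lizard", "Legs": 4, "Furry": "No", "Sound": None},
]


def drop_missing_rows(table):
    """'Remove NA', option 1: discard every row that has a missing value."""
    return [row for row in table if all(v is not None for v in row.values())]


def clean(table):
    """'Remove NA', option 2: drop the Sound column and encode Furry as 1/0."""
    return [
        {
            "Animal": row["Animal"],
            "Legs": row["Legs"],
            "Furry": 1 if row["Furry"] == "Yes" else 0,
        }
        for row in table
    ]


def min_max_normalize(values):
    """Rescale numbers to the range 0..1 (all zeros if they are all equal)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


cleaned = clean(RAW)
for row in cleaned:
    print(row)
```

Dropping rows loses the lizard entirely, while dropping the column keeps every animal but loses the sound information; which trade-off is right depends on how the data will be used.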
 
Model Building
 
Ask yourself: What type of problem are you trying to solve?
Data + Algorithm = model
Algorithm:
Clustering, Regression, Decision Tree, etc.
Key vocab:
Supervised vs Unsupervised learning
Supervised - the training data comes with known answers (labels), e.g. for categorizing
Unsupervised - no answers are given; the ML finds patterns for you
Algorithm
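To make "Data + Algorithm = model" and the supervised/unsupervised split a little more concrete, here is a minimal sketch with two toy algorithms. Everything in it is an assumption for illustration: the weight and height numbers are invented, nearest-neighbour stands in for a supervised algorithm, and a tiny one-dimensional two-means routine stands in for clustering.

```python
def train_nearest_neighbor(examples):
    """Supervised: data (features with labels) + algorithm -> model (a function)."""
    def model(features):
        def squared_distance(example):
            return sum((a - b) ** 2 for a, b in zip(example[0], features))
        # Answer with the label of the closest stored example.
        return min(examples, key=squared_distance)[1]
    return model


def two_means(values, iterations=10):
    """Unsupervised: split unlabelled numbers into two groups around two centers."""
    centers = [min(values), max(values)]
    groups = ([], [])
    for _ in range(iterations):
        groups = ([], [])
        for v in values:
            nearest = 0 if abs(v - centers[0]) <= abs(v - centers[1]) else 1
            groups[nearest].append(v)
        # Move each center to the mean of its group (keep it if the group is empty).
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return groups


# Hypothetical (weight_kg, height_cm) measurements with known labels.
labelled = [
    ((4.0, 25.0), "cat"),
    ((5.0, 23.0), "cat"),
    ((20.0, 50.0), "dog"),
    ((30.0, 60.0), "dog"),
]
model = train_nearest_neighbor(labelled)
prediction = model((6.0, 28.0))
print("supervised prediction:", prediction)

# The same weights without labels: the algorithm only finds groups.
groups = two_means([4.0, 5.0, 20.0, 30.0])
print("unsupervised groups:", groups)
```

Note that the unsupervised result has no names attached: it finds two groups, but deciding that one of them means "cat" is still up to a person.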
 
What can Machine Learning do?
 
Classification of new data
Dog or cat?
Find trends and patterns (regression, clustering)
What can’t ML do?
Clean your data!
Identify patterns that ARE NOT in the data
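"Find trends" can be illustrated with the simplest kind of regression, a straight line fitted by least squares. The sketch below is an assumption-laden example: the function name fit_line and the data points are invented, and the points were chosen to lie exactly on a line so the result is easy to check by hand.

```python
def fit_line(xs, ys):
    """Ordinary least squares for a single input: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical measurements that follow y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
slope, intercept = fit_line(xs, ys)
print("trend: y =", slope, "* x +", intercept)
```

The fitted line can only describe the trend that is present in xs and ys; it has no way to report a pattern that the data does not contain.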
 
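Model Evaluating

How well can your model [predict] unseen data?

Training data:

Animal   Legs   Furry
Cat      4      1
Dog      4      1
Pig      4      0
Lizard   4      0
...      ...    ...

Test data:

Animal   Legs   Furry
Cat      4      1
Pig      4      0
Parrot   2      0
...      ...    ...

Key vocab:
Test Data
Precision
Recall
Confidence Interval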
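Precision and recall from the vocab list can be computed directly from a model's predictions on test data. The sketch below assumes a made-up list of true labels and predictions for a "dog or cat?" classifier; the function name and the data are hypothetical.

```python
def precision_recall(actual, predicted, positive):
    """Precision and recall for one class, given true and predicted labels."""
    pairs = list(zip(actual, predicted))
    true_pos = sum(1 for a, p in pairs if p == positive and a == positive)
    false_pos = sum(1 for a, p in pairs if p == positive and a != positive)
    false_neg = sum(1 for a, p in pairs if p != positive and a == positive)
    # Precision: of everything predicted positive, how much was right?
    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0
    # Recall: of everything truly positive, how much did the model find?
    recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0
    return precision, recall


# Hypothetical test-set labels and the model's guesses for the same rows.
actual = ["cat", "cat", "dog", "dog", "cat"]
predicted = ["cat", "dog", "dog", "cat", "cat"]
precision, recall = precision_recall(actual, predicted, positive="cat")
print("precision:", precision, "recall:", recall)
```

Both numbers are only meaningful when the test rows were not used to build the model, which is why the test data is kept separate from the training data.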
 
Model Output
 
What will the output of your model look like?
Key vocab:
Confidence Interval
Bayesian
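One possible answer to "what will the output look like?" is a label together with a score for each option, rather than a bare label. The sketch below is only an assumption about what such an output could be: it turns a list of hypothetical votes (for example, from several nearest neighbours) into shares per label. The "confidence" value here is a simple vote share, not a statistical confidence interval or a Bayesian posterior.

```python
from collections import Counter


def describe_output(votes):
    """Turn raw votes into a label plus a share of votes for every label."""
    counts = Counter(votes)
    total = len(votes)
    label, top_count = counts.most_common(1)[0]
    return {
        "label": label,
        "confidence": top_count / total,
        "scores": {name: count / total for name, count in counts.items()},
    }


# Hypothetical votes for one new animal.
output = describe_output(["cat", "cat", "dog", "cat"])
print(output)
```

Showing the scores alongside the label lets the person using the model see how close the call was.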
 
https://teachablemachine.withgoogle.com/
 
Old slides
[Diagram: Data, Data Cleaning, Model Building, and Model Evaluating, annotated "Supervised vs unsupervised", "Select an algorithm based on the problem you’re trying to solve", and "Does this predict well?"]
 
Data - try for yourself!
 
Open Python (premade workbook - just run code)
 
Data Cleaning - try for yourself!
 
Premade Python notebook
 
FROM MELODY IVORY - do not use
Slide Note

Data flow in an ML system

Example: https://teachablemachine.withgoogle.com/

Labs based on: https://github.com/dlab-berkeley/R-Fundamentals

"Simple outline of learning"

A learning system can be described at a high level as follows.

Step one: observe one or more examples. Step two: find relevant features to interpret. Step three: create a rule that evaluates your features. Step four: apply that rule and output the result. Optionally, step five: observe a "true" result and use it to revise your features and rules. This is how humans learn from experience. Machine learning operates on the same structural principles.
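The outline above maps fairly directly onto a perceptron-style learning loop. The sketch below is one illustrative reading of those steps, not the method used in the course: the animals, the 1/0 labels, the feature choice, and all function names are assumptions made for the example.

```python
def extract_features(animal):
    """Step two: pick the features to interpret (here legs and furriness)."""
    return [animal["legs"], animal["furry"]]


def apply_rule(weights, bias, features):
    """Steps three and four: a weighted-sum rule that outputs 1 or 0."""
    score = bias + sum(w * x for w, x in zip(weights, features))
    return 1 if score > 0 else 0


def learn(examples, passes=10, rate=1.0):
    """Repeat the observe / apply / revise cycle over the examples."""
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(passes):
        for animal, truth in examples:  # Step one: observe an example
            features = extract_features(animal)
            guess = apply_rule(weights, bias, features)
            error = truth - guess  # Step five: compare with the "true" result
            # Revise the rule only when the guess was wrong (error is 0 otherwise).
            weights = [w + rate * error * x for w, x in zip(weights, features)]
            bias += rate * error
    return weights, bias


# Hypothetical labelled examples: 1 for cat and dog, 0 for lizard and parrot.
examples = [
    ({"legs": 4, "furry": 1}, 1),  # cat
    ({"legs": 4, "furry": 1}, 1),  # dog
    ({"legs": 4, "furry": 0}, 0),  # lizard
    ({"legs": 2, "furry": 0}, 0),  # parrot
]
weights, bias = learn(examples)
guesses = [apply_rule(weights, bias, extract_features(a)) for a, _ in examples]
print("learned rule:", weights, bias, "guesses:", guesses)
```

In this toy run the rule ends up relying on the furry feature, because that is the only feature that separates the two labels in the examples.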

Some people have outlined ML as falling into three 'categories': detection, prediction, and generation. [explain the three]. However, for designers, this classification limits creative thinking by ignoring the different kinds of human-machine interaction that take place between these distinctions. There's predictive detection, predictive generation, generative detection, and so on.

