Decision Trees: A Visual Guide

Lesson 5: Decision Tree (Rule-Based Approach)
Example: [dataset table shown on the slides, listing the features (lookout, temperature, humidity, windy) and the class label (will the match be played: yes/no)]
Given: <sunny, cool, high, true>
Predict: will there be a match?
Assume that I have a set of rules:
- If ((lookout = sunny) and (humidity = high) and (windy = false)) then (yes) else (no)
- If (lookout = overcast) then (yes)
- If ((lookout = sunny) and (humidity = high)) then (yes) else (no)
- and so on…
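As a minimal illustration (not part of the original slides), the first two rules can be written directly as conditionals. The function name, the string-valued features, and the way overlapping rules are ordered are my own assumptions:

```python
def predict_match(lookout, temperature, humidity, windy):
    """Apply the hand-written rules in order (temperature is unused by these rules)."""
    # Rule: if (lookout=sunny) and (humidity=high) and (windy=false) then yes else no
    if lookout == "sunny" and humidity == "high" and windy == "false":
        return "yes"
    # Rule: if (lookout=overcast) then yes
    if lookout == "overcast":
        return "yes"
    return "no"

# The query from the slide: <sunny, cool, high, true>
print(predict_match("sunny", "cool", "high", "true"))  # -> "no" with this rule ordering
```

Writing many overlapping rules this way quickly becomes hard to read and maintain, which is what motivates organizing them as a single tree.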
The set of rules can be visualized as a tree.
Rule 1: If ((lookout = sunny) and (humidity = high)) then (yes) else (no)
Rule 2: If (lookout = overcast) then (yes)
Rule 3: If ((lookout = rain) and (windy = true)) then (no) else (yes)
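One way to encode these three rules as a tree-shaped data structure, as a rough sketch (the nested-dict layout and the "else" branches standing in for each rule's else outcome are my own choices, not the diagram from the slides):

```python
# Rules 1-3 encoded as a nested dict: internal nodes test a feature, leaves are answers.
tree = {
    "lookout": {
        "sunny":    {"humidity": {"high": "yes", "else": "no"}},  # Rule 1
        "overcast": "yes",                                        # Rule 2
        "rain":     {"windy": {"true": "no", "else": "yes"}},     # Rule 3
    }
}

def classify(example, node=tree):
    """Walk the tree until a leaf (a plain 'yes'/'no' string) is reached."""
    while isinstance(node, dict):
        feature, branches = next(iter(node.items()))
        node = branches.get(example.get(feature), branches.get("else"))
    return node

print(classify({"lookout": "sunny", "humidity": "high", "windy": "true"}))  # -> "yes"
```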
Many possible trees can fit the same dataset.
Which tree is the best?
Which feature should be used to split the dataset?
Types of DT
ID3 (Iterative Dichotomiser 3)
C4.5 (Successor of ID3)
CART (Classification and Regression Tree)
Random Forest
ID3
1. Calculate the entropy of the total dataset: H(S) = 0.9911.
(The worked example on the following slides uses a dataset of 9 people labelled F or M, described by attributes such as Hair Length, Weight, and Age.)
For a dataset S with p examples of one class and n of the other:
Entropy(S) = -(p/(p+n)) log2(p/(p+n)) - (n/(p+n)) log2(n/(p+n))
Entropy(4F, 5M) = -(4/9) log2(4/9) - (5/9) log2(5/9) = 0.9911
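A quick numerical check of this value, as a minimal sketch (the `entropy` helper is my own):

```python
import math

def entropy(counts):
    """Shannon entropy (base 2) of a class distribution given as counts."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)

print(round(entropy([4, 5]), 4))  # 0.9911, the entropy of the full 4F/5M dataset
```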
 
2. Choose an attribute and split the dataset by that attribute.
3. Calculate the entropy of each branch.
For the split on Hair Length <= 5, the two branches contain (1F, 3M) and (3F, 2M):
Entropy(1F, 3M) = -(1/4) log2(1/4) - (3/4) log2(3/4) = 0.8113
Entropy(3F, 2M) = -(3/5) log2(3/5) - (2/5) log2(2/5) = 0.9710
4. Calculate the information gain of the split.
Gain(Hair Length <= 5) = 0.9911 - (4/9 * 0.8113 + 5/9 * 0.9710) = 0.0911
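The same calculation as a short, self-contained sketch (helper names are my own):

```python
import math

def entropy(counts):
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)

def information_gain(parent_counts, branch_counts):
    """Parent entropy minus the size-weighted average entropy of the branches."""
    n = sum(parent_counts)
    weighted = sum(sum(b) / n * entropy(b) for b in branch_counts)
    return entropy(parent_counts) - weighted

# Split "Hair Length <= 5": branches (1F, 3M) and (3F, 2M) of the (4F, 5M) dataset
print(round(information_gain([4, 5], [[1, 3], [3, 2]]), 4))  # 0.0911
```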
What is Information Gain?
Which split is better?
Information gain is the reduction in uncertainty (entropy) of the parent dataset after the split.
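In symbols, for a split of the parent dataset S into branches S1 and S2 (matching the numbers in the gain calculation above):

$$
IG(\text{split}) = H(S) - \big[\, p(S_1)\,H(S_1) + p(S_2)\,H(S_2) \,\big]
$$

where H(S) is the entropy of the parent dataset and p(S_i) is the fraction of the parent's examples that fall into branch S_i.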
5. Repeat steps 2-4 for all attributes.
6. The attribute that yields the largest information gain is chosen for the decision node.
Gain(Hair Length <= 5) = 0.0911
Gain(Weight <= 160) = 0.5900
Gain(Age <= 40) = 0.0183
Weight <= 160 gives the largest gain, so it is chosen for this decision node.
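The selection in step 6 then amounts to taking the maximum over the computed gains, for example:

```python
gains = {"Hair Length <= 5": 0.0911, "Weight <= 160": 0.5900, "Age <= 40": 0.0183}
best_split = max(gains, key=gains.get)
print(best_split)  # 'Weight <= 160' (largest information gain wins)
```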
ID3 (summary)
1. Calculate the entropy of the total dataset.
2. Choose an attribute and split the dataset by that attribute.
3. Calculate the entropy of each branch.
4. Calculate the information gain of the split.
5. Repeat steps 2-4 for all attributes.
6. The attribute that yields the largest information gain is chosen for the decision node.
7. Repeat steps 1-6 for all sub-datasets until every sub-dataset contains a single class.
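Putting the seven steps together, a deliberately simplified sketch of the recursive procedure (categorical attributes only; the function and variable names are my own, not an implementation from the slides):

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (base 2) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())

def information_gain(rows, labels, attribute):
    """Parent entropy minus the size-weighted entropy of each branch (steps 3-4)."""
    branches = {}
    for row, label in zip(rows, labels):
        branches.setdefault(row[attribute], []).append(label)
    weighted = sum(len(b) / len(labels) * entropy(b) for b in branches.values())
    return entropy(labels) - weighted

def id3(rows, labels, attributes):
    """Grow the tree until every leaf holds a single class (step 7's stopping rule)."""
    if len(set(labels)) == 1:                 # pure node: a single class remains
        return labels[0]
    if not attributes:                        # no attributes left: majority class
        return Counter(labels).most_common(1)[0][0]
    # Steps 2-6: evaluate every attribute and keep the one with the largest gain.
    best = max(attributes, key=lambda a: information_gain(rows, labels, a))
    remaining = [a for a in attributes if a != best]
    tree = {best: {}}
    for value in {row[best] for row in rows}:
        sub_rows = [r for r, l in zip(rows, labels) if r[best] == value]
        sub_labels = [l for r, l in zip(rows, labels) if r[best] == value]
        tree[best][value] = id3(sub_rows, sub_labels, remaining)
    return tree
```

The returned tree is a nested dictionary of the same general shape as the rule tree sketched earlier in the lesson.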