Introduction to IBM SPSS Modeler: Association Analysis and Market Basket Analysis

Data Mining Concepts
Introduction to Undirected Data Mining: Association Analysis
Prepared by David Douglas, University of Arkansas
Hosted by the University of Arkansas
IBM SPSS
Association Analysis
Also referred to as
Affinity Analysis
Market Basket Analysis
For MBA, basically means what is being purchased together
Association rules represent patterns without a specific target;
thus undirected or unsupervised data mining
Fits in the Exploratory category of data mining
Association Rules
Other potential uses
Items purchased on a credit card give insight into the next
product or service purchased
Help determine bundles for telecoms
Help bankers identify customers for other
services
Unusual combinations of things like insurance claims
may need further investigation
Medical histories may give indications of complications
or helpful combinations for patients
Defining MBA
MBA data
Customers
Purchases (baskets or item sets)
Items
Figure 9-3 set of tables
Purchase (Order) is the fundamental data structure
Individual items are line items
Product: descriptive info
Customer info can be helpful
Levels of Data
Adapted from Berry & Linoff
MBA
The three levels of data are important for MBA. They can
be used to answer a number of questions:
Average number of baskets per customer per time unit
Average number of unique items per customer
Average number of items per basket
For a given product, what is the proportion of customers who
have ever purchased the product?
For a given product, what is the average number of baskets per
customer that include the product?
For a given product, what is the average quantity purchased in
an order when the product is purchased?
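The questions above can be answered by simple counting over line-item records. The following is a minimal sketch, not part of the original slides: the record layout (customer, order, product, quantity), the sample rows, and the function names are all assumptions made for illustration. It ignores the time unit, and it assumes "baskets per customer" for a product means per customer who bought that product.

```python
from collections import defaultdict

# Hypothetical line-item records: (customer_id, order_id, product, quantity).
line_items = [
    ("c1", "o1", "orange juice", 2),
    ("c1", "o1", "soda", 1),
    ("c1", "o2", "orange juice", 1),
    ("c2", "o3", "milk", 1),
    ("c2", "o3", "orange juice", 3),
    ("c3", "o4", "milk", 2),
]

def basket_summary(items):
    """Averages across the customer, basket and line-item levels."""
    baskets_by_customer = defaultdict(set)   # customer -> order ids
    products_by_customer = defaultdict(set)  # customer -> distinct products
    lines_by_basket = defaultdict(int)       # order id -> number of line items
    for customer, order, product, _qty in items:
        baskets_by_customer[customer].add(order)
        products_by_customer[customer].add(product)
        lines_by_basket[order] += 1
    n_customers = len(baskets_by_customer)
    return {
        "baskets_per_customer":
            sum(len(b) for b in baskets_by_customer.values()) / n_customers,
        "unique_items_per_customer":
            sum(len(p) for p in products_by_customer.values()) / n_customers,
        "items_per_basket":
            sum(lines_by_basket.values()) / len(lines_by_basket),
    }

def product_summary(items, product):
    """Per-product answers; returns None if nobody bought the product."""
    customers = {c for c, _o, _p, _q in items}
    buyers = {c for c, _o, p, _q in items if p == product}
    if not buyers:
        return None
    baskets_with = {(c, o) for c, o, p, _q in items if p == product}
    quantities = [q for _c, _o, p, q in items if p == product]
    return {
        "share_of_customers": len(buyers) / len(customers),
        # Assumption: averaged over customers who bought the product.
        "baskets_per_buying_customer": len(baskets_with) / len(buyers),
        "avg_quantity_per_order": sum(quantities) / len(baskets_with),
    }

print(basket_summary(line_items))
print(product_summary(line_items, "orange juice"))
```

In practice the same counts would more likely come from SQL or from the data mining tool itself; the point is only that each question maps to a grouping at one of the three levels.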
Item Popularity
Most common item in one-item baskets
Most common item in multi-item baskets
Most common items among repeat customers
Change in buying patterns of item over time
Buying pattern for an item by region
Time and geography are two of the most
important attributes of MBA data
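As a rough illustration of the first two popularity questions, the sketch below counts the most common item separately for one-item and multi-item baskets. The basket data and the function name are hypothetical; time and region would be handled the same way, by filtering or grouping the baskets before counting.

```python
from collections import Counter

# Hypothetical baskets: order id -> items in that order.
baskets = {
    "o1": ["milk"],
    "o2": ["orange juice", "soda"],
    "o3": ["milk"],
    "o4": ["orange juice", "detergent"],
    "o5": ["soda"],
    "o6": ["orange juice", "milk", "window cleaner"],
}

def most_common_item(baskets, multi_item):
    """Most frequent item among one-item baskets (multi_item=False)
    or multi-item baskets (multi_item=True); None if no basket qualifies."""
    counts = Counter()
    for items in baskets.values():
        distinct = set(items)  # count each item once per basket
        if (len(distinct) > 1) == multi_item:
            counts.update(distinct)
    return counts.most_common(1)[0] if counts else None

print(most_common_item(baskets, multi_item=False))  # ('milk', 2)
print(most_common_item(baskets, multi_item=True))   # ('orange juice', 3)
```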
Tracking Market Interventions
Adapted from Berry & Linoff
Association Rules
Actionable Rules
Wal-Mart customers who purchase Barbie dolls have a
60 percent likelihood of also purchasing one of three
types of candy bars
Trivial Rules
Customers who purchase maintenance agreements
are very likely to purchase a large appliance
Inexplicable Rules
When a new hardware store opens, one of the most
commonly sold items is toilet cleaners
Adapted from Berry & Linoff
What exactly is an Association Rule?
Of the form:
IF antecedent THEN consequent
IF (orange juice, milk) THEN (bread, bacon)
Rules include measures of support and confidence
How good is an Association Rule?
Transactions can be converted to co-occurrence
matrices
Co-occurrence tables highlight simple patterns
Confidence and support can be determined directly
from a co-occurrence table
Or by counting, e.g. via SQL
Data mining software makes the presentation easy
Co-Occurrence Table
Customer   Items
1          Orange juice, soda
2          Milk, orange juice, window cleaner
3          Orange juice, detergent
4          Orange juice, detergent, soda
5          Window cleaner, milk
Co-Occurrence Table
        OJ   WC   Milk  Soda  Det
OJ      4    -    -     -     -
WC      1    2    -     -     -
Milk    1    2    2     -     -
Soda    2    0    0     2     -
Det     2    0    0     1     2
(OJ = orange juice, WC = window cleaner, Det = detergent)
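The co-occurrence counts can be derived from the five customer transactions by counting every pair of items that appears in the same basket. The following is a minimal sketch of that counting; the function and variable names are illustrative and not from the slides. Pairing an item with itself gives the diagonal, i.e. how many baskets contain that item.

```python
from collections import Counter
from itertools import combinations_with_replacement

# The five transactions from the slide, one set of items per customer.
transactions = [
    {"orange juice", "soda"},
    {"milk", "orange juice", "window cleaner"},
    {"orange juice", "detergent"},
    {"orange juice", "detergent", "soda"},
    {"window cleaner", "milk"},
]

def cooccurrence(transactions):
    """Count baskets containing each pair of items (order-independent)."""
    counts = Counter()
    for basket in transactions:
        # Sorting makes (a, b) and (b, a) land on the same key;
        # "with replacement" adds the (a, a) diagonal entries.
        for a, b in combinations_with_replacement(sorted(basket), 2):
            counts[(a, b)] += 1
    return counts

counts = cooccurrence(transactions)

def together(a, b):
    """Number of baskets containing both a and b (0 if never together)."""
    return counts[tuple(sorted((a, b)))]

items = ["orange juice", "window cleaner", "milk", "soda", "detergent"]
for row in items:
    print(row.ljust(15), [together(row, col) for col in items])
```

For example, `together("soda", "orange juice")` is 2 and `together("orange juice", "orange juice")` is 4, which are the counts the confidence and support calculations rely on.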
Confidence, Support and Lift
Support for the rule
  = (# records with both antecedent and consequent) / (Total # records)
Confidence for the rule
  = (# records with both antecedent and consequent) / (# records with the antecedent)
Expected Confidence
  = (# records with the consequent) / (Total # records)
Lift
  = Confidence / Expected Confidence
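These four definitions translate directly into code. The sketch below is one way to write them, assuming each record is a set of items and that the antecedent and consequent are themselves sets (so multi-item rules work too). The function names and the choice to return 0.0 when the antecedent never occurs are assumptions for illustration.

```python
# The five sample transactions, one set of items per record.
transactions = [
    {"orange juice", "soda"},
    {"milk", "orange juice", "window cleaner"},
    {"orange juice", "detergent"},
    {"orange juice", "detergent", "soda"},
    {"window cleaner", "milk"},
]

def count_with(transactions, itemset):
    """Number of records that contain every item in itemset."""
    return sum(itemset <= record for record in transactions)

def support(transactions, antecedent, consequent):
    both = count_with(transactions, antecedent | consequent)
    return both / len(transactions)

def confidence(transactions, antecedent, consequent):
    n_antecedent = count_with(transactions, antecedent)
    if n_antecedent == 0:
        return 0.0  # assumption: a rule that never fires gets confidence 0
    return count_with(transactions, antecedent | consequent) / n_antecedent

def expected_confidence(transactions, consequent):
    return count_with(transactions, consequent) / len(transactions)

def lift(transactions, antecedent, consequent):
    return (confidence(transactions, antecedent, consequent)
            / expected_confidence(transactions, consequent))

# Rule: IF milk THEN window cleaner
a, c = {"milk"}, {"window cleaner"}
print(support(transactions, a, c))     # 2 of 5 records
print(confidence(transactions, a, c))  # 2 of the 2 milk records
print(round(lift(transactions, a, c), 2))
```

A lift above 1 means the consequent appears more often when the antecedent is present than it does overall; a lift below 1 means the opposite.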
Confidence and Support
Rule: If soda then orange juice
Support for the rule:
From the co-occurrence table, soda and orange juice occur together 2
times (out of 5 total transactions)
Thus, support for the rule is 2/5 or 40%
Confidence for the rule:
Soda occurs 2 times; so confidence of orange juice given soda is
2/2 or 100%
Lift for the rule: Confidence / Expected Confidence
Orange juice occurs in 4 of 5 transactions; so expected confidence
is 4/5 or 80%
lift = 1.0/.8 = 1.25
Rule: If orange juice then soda
Support for the rule is the same: 40%
Orange juice occurs 4 times; so confidence of soda given orange juice
is 2/4 or 50%
Soda occurs in 2 of 5 transactions; so expected confidence
is 2/5 or 40%
lift = .5/.4 = 1.25
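The worked example can be checked by brute force: enumerate every one-item-to-one-item rule over the five transactions and keep those that meet a minimum support. This is only a sketch under stated assumptions (the 40% threshold and all names are illustrative, and real tools use far more efficient rule-generation algorithms). Expected confidence is the consequent's own frequency, and lift equals P(A and C) / (P(A) P(C)), so a rule and its reverse always share the same lift.

```python
from itertools import permutations

transactions = [
    {"orange juice", "soda"},
    {"milk", "orange juice", "window cleaner"},
    {"orange juice", "detergent"},
    {"orange juice", "detergent", "soda"},
    {"window cleaner", "milk"},
]

def single_item_rules(transactions, min_support=0.4):
    """Return (antecedent, consequent, support, confidence, lift) tuples
    for all one-item rules whose support is at least min_support."""
    n = len(transactions)
    items = sorted(set().union(*transactions))
    count = {i: sum(i in t for t in transactions) for i in items}
    rules = []
    for a, c in permutations(items, 2):
        both = sum(a in t and c in t for t in transactions)
        sup = both / n
        if sup < min_support:
            continue
        conf = both / count[a]
        expected = count[c] / n  # frequency of the consequent alone
        rules.append((a, c, sup, conf, conf / expected))
    return rules

for a, c, sup, conf, lft in single_item_rules(transactions):
    print(f"IF {a} THEN {c}: support={sup:.0%} "
          f"confidence={conf:.0%} lift={lft:.2f}")
```

With these transactions, the soda and orange juice rules have 100% and 50% confidence respectively but the same lift of 1.25, so confidence alone can differ by direction while lift cannot.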
Building Association Rules
Adapted from Berry & Linoff
Product Hierarchies
Lessons Learned
MBA is complex and no one technique is powerful
enough to provide all the answers.
Three levels of data: order (basket), line items and
customer
MBA can answer a number of questions
Association rules are the most common technique for
MBA
Generate rules and evaluate them with support, confidence and lift

This presentation introduces Association Analysis, also known as Affinity Analysis or Market Basket Analysis (MBA), in IBM SPSS Modeler 14.2. It covers identifying patterns in data without a specific target, which makes this a form of unsupervised data mining; the uses of association rules, including insight into customer purchasing behavior and applications in other industries; and the fundamentals of MBA data, including its structure and the importance of its different levels.

  • IBM SPSS Modeler
  • Association Analysis
  • Market Basket Analysis
  • Data Mining
  • MBA

Uploaded on Sep 26, 2024





