Model Evaluation in Business Intelligence and Analytics

Business Intelligence and Analytics:
What is a good model?
Session 8
 
Introduction

What is desired from data mining results? How would you measure that your model is any good? How to measure performance in a meaningful way?
Model evaluation is application-specific. We look at common issues and themes in evaluation: frameworks and metrics for classification and instance scoring.
 
Bad positives and harmless negatives

Classification terminology:
- a bad outcome = a positive example [alarm!]
- a good outcome = a negative example [uninteresting]

Further examples:
- medical test: positive test = disease is present
- fraud detector: positive test = unusual activity on account

A classifier tries to distinguish the majority of cases (the negatives, the uninteresting) from the small number of alarming cases (the positives):
- the number of mistakes made on negative examples (false positive errors) will be relatively high
- the cost of each mistake made on a positive example (false negative error) will be relatively high
 
Agenda

- Measuring accuracy
- Confusion matrix
- Unbalanced classes
- A key analytical framework: Expected value
- Evaluate classifier use
- Frame classifier evaluation
- Evaluation and baseline performance
 
Measuring accuracy and its problems

Up to now, we have measured a model's performance by some simple metric: classifier error rate, accuracy, ...
Simple example: accuracy. Classification accuracy is popular, but usually too simplistic for applications of data mining to real business problems.
Instead, decompose and count the different types of correct and incorrect decisions made by a classifier.
 
The confusion matrix

A confusion matrix for a problem involving n classes is an n×n matrix with the columns labeled with actual classes and the rows labeled with predicted classes. Each example in a test set has an actual class label and the class predicted by the classifier.
The confusion matrix separates out the decisions made by the classifier:
- actual/true classes: p(ositive), n(egative)
- predicted classes: Y(es), N(o)
The main diagonal contains the count of correct decisions.
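As a minimal sketch, the counting behind a 2x2 confusion matrix can be written in a few lines of Python. The p/n and Y/N labels follow the slides; the toy predictions are made up for illustration:

```python
# Build a 2x2 confusion matrix by counting (predicted, actual) pairs.
def confusion_matrix(predicted, actual):
    """Keys are (predicted, actual); rows Y/N, columns p/n."""
    counts = {("Y", "p"): 0, ("Y", "n"): 0, ("N", "p"): 0, ("N", "n"): 0}
    for h, a in zip(predicted, actual):
        counts[(h, a)] += 1
    return counts

predicted = ["Y", "Y", "N", "N", "Y"]   # classifier decisions (toy data)
actual    = ["p", "n", "p", "n", "p"]   # true class labels

cm = confusion_matrix(predicted, actual)
# The main diagonal holds the correct decisions: (Y, p) and (N, n).
correct = cm[("Y", "p")] + cm[("N", "n")]
```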
 
Unbalanced classes (1/3)

In practical classification problems, one class is often rare. Classification is used to find a relatively small number of unusual cases (defrauded customers, defective parts, consumers who would actually respond, ...). The class distribution is unbalanced ("skewed").
Evaluation based on accuracy does not work:
- Example: with a 999:1 ratio, always choosing the most prevalent class already yields 99.9% accuracy!
- Fraud detection: skews of 10²
Is a model with 80% accuracy always better than a model with 37% accuracy? We need to know more details about the population.
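The 999:1 example can be checked directly. This sketch builds a maximally skewed label list and scores the trivial strategy of always predicting the prevalent (negative) class:

```python
# 999:1 skew: a "classifier" that always predicts the most prevalent
# class reaches 99.9% accuracy without learning anything.
actual = ["n"] * 999 + ["p"]          # 999 negatives, 1 positive
predictions = ["N"] * len(actual)      # always predict the majority class

correct = sum(1 for h, a in zip(predictions, actual)
              if (h == "N" and a == "n") or (h == "Y" and a == "p"))
accuracy = correct / len(actual)       # 0.999
```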
 
Unbalanced classes (2/3)

Consider two models A and B for the churn example (1,000 customers, 1:9 ratio of churning). Both models correctly classify 80% of the balanced population.
- Classifier A often falsely predicts that customers will churn.
- Classifier B makes many errors in the opposite direction.
 
Unbalanced classes (3/3)

Note the different performances of the models in form of a confusion matrix:
- Model A achieves 80% accuracy on the balanced sample.
- On the unbalanced population, A's accuracy is 37% and B's accuracy is 93%.
Which model is better?
 
Unequal costs and benefits

How much do we care about the different errors and correct decisions? Classification accuracy makes no distinction between false positive and false negative errors. In real-world applications, however, different kinds of errors lead to different consequences!
Examples for medical diagnosis:
- A patient is told he has cancer although he does not: a false positive error, expensive, but not life-threatening.
- A patient has cancer, but she is told that she has not: a false negative error, much more serious.
Errors should therefore be counted separately, and the cost or benefit of each decision estimated.
 
A look beyond classification

Another example: how to measure the accuracy/quality of a regression model? Predict how much a given customer will like a given movie.
The typical accuracy measure for regression is the mean-squared error, computed on the value of the target variable, e.g., the number of stars that a user would give as a rating for the movie.
Is the mean-squared error a meaningful metric?
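The mean-squared error itself is straightforward to compute; the star ratings below are invented for illustration:

```python
# Mean-squared error: average of squared differences between the
# predicted and actual values of the target variable.
def mean_squared_error(predicted, actual):
    return sum((y_hat - y) ** 2 for y_hat, y in zip(predicted, actual)) / len(actual)

predicted_stars = [4.5, 3.0, 2.0]   # model's predicted ratings (toy data)
actual_stars    = [5.0, 3.0, 4.0]   # ratings the users actually gave

mse = mean_squared_error(predicted_stars, actual_stars)  # (0.25 + 0 + 4) / 3
```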
 
Agenda

- Measuring accuracy
- Confusion matrix
- Unbalanced classes
- A key analytical framework: Expected value
- Evaluate classifier use
- Frame classifier evaluation
- Evaluation and baseline performance
 
The expected value framework

An expected value calculation includes an enumeration of the possible outcomes of a situation. The expected value is the weighted average of the values of the different possible outcomes, where the weight given to each value is the probability of its occurrence. Example: different levels of profit. We focus on the maximization of expected profit.
General form of the expected value computation:

  EV = Σ_i p(o_i) · v(o_i)

where o_i is a possible outcome, p(o_i) its probability, and v(o_i) its value. The probabilities can be estimated from available data.
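A minimal sketch of the weighted-average computation; the outcome labels, values, and probabilities here are invented for illustration:

```python
# Expected value as a probability-weighted average of outcome values.
outcomes = {              # value of each outcome (e.g., profit levels)
    "high_profit": 100.0,
    "low_profit": 20.0,
    "loss": -50.0,
}
probabilities = {"high_profit": 0.2, "low_profit": 0.5, "loss": 0.3}

expected_value = sum(probabilities[o] * v for o, v in outcomes.items())
# 0.2 * 100 + 0.5 * 20 + 0.3 * (-50) = 15.0
```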
Expected value for use of a classifier (1/2)

Use of a classifier: predict a class and take some action.
Example target marketing: assign each consumer to either the class "likely responder" or "not likely responder". Response rates are usually relatively low, so no consumer may seem like a likely responder.
Computation of the expected value: a model gives an estimated probability of response p_R(x) for any consumer with feature vector x. Calculate the expected benefit (or cost) of targeting the consumer,

  expected benefit = p_R(x) · v_R + (1 - p_R(x)) · v_NR

with v_R being the value of a response and v_NR the value from no response.

Expected value for use of a classifier (2/2)

Example: price of product $200, cost of product $100, cost of targeting a consumer $1, so profit v_R = $99 and v_NR = -$1. Do we make a profit? Is the expected value (profit) of targeting greater than zero?
We should target the consumer as long as the estimated probability of responding is greater than 1%!
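The targeting rule above can be sketched directly with the slide's numbers (v_R = 99, v_NR = -1); the break-even point falls exactly at a 1% response probability:

```python
# Expected profit of targeting one consumer, and the resulting decision rule.
V_R, V_NR = 99.0, -1.0   # profit if she responds / loss if she does not

def expected_profit_of_targeting(p_response):
    return p_response * V_R + (1.0 - p_response) * V_NR

def should_target(p_response):
    # Target only when the expected profit is positive, i.e., p > 1/100.
    return expected_profit_of_targeting(p_response) > 0.0
```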
 
Expected value for evaluation of a classifier

Goal: compare the quality of different models with each other.
- Does the data-driven model perform better than a hand-crafted model?
- Does a classification tree work better than a linear discriminant model?
- Do any of the models perform substantially better than a baseline model?
In aggregate: how well does each model do, i.e., what is its expected value?

Expected value calculation
 
Expected value for evaluation of a classifier

Aggregate together all the different cases:
- When we target consumers, what is the probability that they (do not) respond?
- What about when we do not target consumers: would they have responded?
This information is available in the confusion matrix. Each outcome o_i corresponds to one of the possible combinations of the class we predict and the actual class.
Example confusion matrix / estimates of probability:

              actual p   actual n
predicted Y      56          7
predicted N       5         42
 
Error rates

Where do the probabilities of errors and correct decisions actually come from? Each cell of the confusion matrix contains a count of the number of decisions corresponding to the combination of (predicted, actual), written count(h, a). With T instances in the test set, compute the estimated probabilities as

  p(h, a) = count(h, a) / T
 
Costs and benefits

Compute cost-benefit values for each decision pair. A cost-benefit matrix specifies, for each (predicted, actual) pair, the cost or benefit of making such a decision.
- Correct classifications (true positives and true negatives) correspond to b(Y, p) and b(N, n), respectively.
- Incorrect classifications (false positives and false negatives) correspond to b(Y, n) and b(N, p), respectively [often negative benefits, or costs].
Costs and benefits cannot be estimated from data. How much is it really worth to us to retain a customer? Often, average estimated costs and benefits are used.
 
Costs and benefits - example

Targeted marketing example:
- A false positive occurs when we classify a consumer as a likely responder and therefore target her, but she does not respond: benefit b(Y, n) = -1.
- A false negative is a consumer who was predicted not to be a likely responder, but would have bought if offered. No money spent, nothing gained: benefit b(N, p) = 0.
- A true positive is a consumer who is offered the product and buys it: benefit b(Y, p) = 200 - 100 - 1 = 99.
- A true negative is a consumer who was not offered a deal but who would not have bought it: benefit b(N, n) = 0.
Sum up in a cost-benefit matrix:

              actual p   actual n
predicted Y      99         -1
predicted N       0          0
 
Expected profit computation (1/2)

Compute the expected profit by cell-wise multiplication of the matrix of costs and benefits against the matrix of probabilities:

  expected profit = Σ_{h,a} p(h, a) · b(h, a)

This is sufficient for a comparison of various models.
Alternative calculation: factor out the probabilities of seeing each class (class priors). The class priors P(p) and P(n) specify the likelihood of seeing positive versus negative instances. Factoring out allows us to separate the influence of class imbalance from the predictive power of the model.
 
Expected profit computation (2/2)

Factoring out the priors yields the following alternative expression for the expected profit:

  expected profit = P(p) · [p(Y|p) · b(Y, p) + p(N|p) · b(N, p)]
                  + P(n) · [p(N|n) · b(N, n) + p(Y|n) · b(Y, n)]

The first component corresponds to the expected profit from the positive examples, whereas the second corresponds to the expected profit from the negative examples.
 
Costs and benefits example - alternative expression

This expected value means that if we apply this model to a population of prospective customers and mail offers to those it classifies as positive, we can expect to make an average of about $50.54 profit per consumer.
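A minimal sketch of the cell-wise computation, using the example confusion matrix and cost-benefit matrix from the slides; the result is the average profit per member of the targeted population, and the exact figure depends on the counts used:

```python
# Cell-wise expected profit: multiply each probability p(h, a) by the
# corresponding benefit b(h, a) and sum over all four cells.
counts = {("Y", "p"): 56, ("Y", "n"): 7, ("N", "p"): 5, ("N", "n"): 42}
benefits = {("Y", "p"): 99, ("Y", "n"): -1, ("N", "p"): 0, ("N", "n"): 0}

T = sum(counts.values())  # total number of test instances
expected_profit = sum((counts[cell] / T) * benefits[cell] for cell in counts)
```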
 
Further insights

In sum: instead of computing accuracies for competing models, we would compute expected values.
We can compare two models even though one is based on a representative distribution and one is based on a class-balanced data set: just replace the priors. For a balanced distribution, P(p) = 0.5 and P(n) = 0.5.
Make sure that the signs of the quantities in the cost-benefit matrix are consistent. Do not double count by putting a benefit in one cell and a negative cost for the same thing in another cell.
 
Other evaluation metrics

Based on the entries of the confusion matrix, we can describe various evaluation metrics:
- True positive rate (recall): TP / (TP + FN)
- False negative rate: FN / (TP + FN)
- Precision (accuracy over the cases predicted to be positive): TP / (TP + FP)
- F-measure (harmonic mean of precision and recall): 2 · precision · recall / (precision + recall)
- Sensitivity (same as the true positive rate): TP / (TP + FN)
- Specificity: TN / (TN + FP)
- Accuracy (count of correct decisions): (TP + TN) / (P + N)
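The metrics above are one-liners once the four confusion-matrix entries are known; the TP/FP/FN/TN values below reuse the example matrix from the earlier slides:

```python
# Evaluation metrics derived from the entries of a 2x2 confusion matrix.
TP, FP, FN, TN = 56, 7, 5, 42

recall = TP / (TP + FN)                 # true positive rate / sensitivity
false_negative_rate = FN / (TP + FN)
precision = TP / (TP + FP)              # accuracy over predicted positives
f_measure = 2 * precision * recall / (precision + recall)  # harmonic mean
specificity = TN / (TN + FP)
accuracy = (TP + TN) / (TP + FP + FN + TN)
```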
 
Agenda

- Measuring accuracy
- Confusion matrix
- Unbalanced classes
- A key analytical framework: Expected value
- Evaluate classifier use
- Frame classifier evaluation
- Evaluation and baseline performance
 
Baseline performance (1/3)

Consider what would be a reasonable baseline against which to compare model performance. This demonstrates to stakeholders that data mining has added value (or not).
What is the appropriate baseline for comparison? It depends on the actual application.
Nate Silver on weather forecasting: "There are two basic tests that any weather forecast must pass to demonstrate its merit: (1) It must do better than what meteorologists call persistence: the assumption that the weather will be the same tomorrow (and the next day) as it was today. (2) It must also beat climatology, the long-term historical average of conditions on a particular date in a particular area."
 
Baseline performance (2/3)

Baseline performance for classification:
- Compare to a completely random model (very easy).
- Implement a simple (but not simplistic) alternative model.
A majority classifier is a naive classifier that always chooses the majority class of the training data set. It may be challenging to outperform: with a classification accuracy of 94% but only 6% of the instances positive, a majority classifier also would have an accuracy of 94%!
Pitfall: don't be surprised that many models simply predict everything to be of the majority class. Maximizing simple prediction accuracy is usually not an appropriate goal.
 
Baseline performance (3/3)

A further alternative: how well does a simple "conditional" model perform?
- Conditional: the prediction differs based on the value of the features.
- Just use the most informative variable for prediction.
- Decision tree: build a tree with only one internal node (a decision stump); tree induction selects the single most informative feature to make a decision.
Compare the quality of models based on data sources, and quantify the value of each source. Implement models that are based on domain knowledge.
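A one-feature baseline in the spirit of a decision stump can be sketched as follows: for each binary feature, predict the per-value majority class, and keep the feature whose rule scores best on the training data. The feature names and data are made up for illustration:

```python
from collections import Counter

def stump_accuracy(feature_values, labels):
    """Training accuracy of predicting the majority class within each feature value."""
    groups = {}
    for v, y in zip(feature_values, labels):
        groups.setdefault(v, []).append(y)
    # Within each group, the majority-class rule gets the majority count right.
    correct = sum(Counter(ys).most_common(1)[0][1] for ys in groups.values())
    return correct / len(labels)

labels = ["p", "p", "n", "n", "n", "p"]
features = {
    "f1": [1, 1, 0, 0, 0, 1],   # perfectly separates the classes
    "f2": [1, 0, 1, 0, 1, 0],   # uninformative
}

# Pick the single most informative feature, as stump induction would.
best = max(features, key=lambda f: stump_accuracy(features[f], labels))
```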
References

Provost, F.; Fawcett, T.: Data Science for Business: Fundamental Principles of Data Mining and Data-Analytic Thinking. O'Reilly, 2013.
Vercellis, C.: Business Intelligence. John Wiley & Sons, 2009.
Frank, E.; Hall, M. A.; Witten, I. H.: The WEKA Workbench. Morgan Kaufmann / Elsevier, 2016.
Brownlee, J.: Machine Learning Mastery With Weka. E-book, 2017.
