Support tools for the VQR Italian Research Assessment Exercise: the Sapienza Experience

Camil Demetrescu, Marco Schaerf
Dept. Computer, Control and Management Engineering

EuroCris Membership meeting, Bonn, May 14, 2013
Outline
The Italian VQR research assessment exercise
The Sapienza experience
Results
Conclusions
Task Force VQR
The VQR National Research Assessment Exercise
In 2012, public Italian universities and research centers participated in a major research assessment exercise (VQR)
Goal: inform selective funding allocation
Coverage: research products published in 2004-2010
Evaluation: mix of peer review and bibliometrics
Main challenge for universities & research centers: choosing a selection of the best products to submit
VQR in a nutshell (1/2)
Each researcher/faculty member submitted up to 3 of her/his best products published in 2004-2010
No duplicate submissions: each product selected by at most one coauthor of the same institution
Evaluation done by 14 panels (GEV)
Different evaluation criteria for each panel
Reference databases:
Thomson Reuters Web of Science (WoS) [main]
Elsevier Scopus [additional]
For each submitted product, institutions had to choose:
a specific evaluation panel to evaluate the product
a subject category
VQR in a nutshell (2/2)
Mandatory pieces of information to submit:
Meta-data (title, authors, etc.)
Full text (PDF)
Abstract
ISSN (journals)
ISBN (other publications)
Outcome of the evaluation: a numeric score for each submitted product
Total score of the institution = sum of the scores of its submitted products (this will determine part of the funding allocation for the next years)
VQR grades and scores
[Table of VQR grades and the corresponding scores, not reproduced in the text.]
Products eligible for evaluation
Articles in journals with ISSN
Books, book chapters, and conference proceedings papers with ISBN
Critical editions, translations, scientific comments
Deposited patents
Compositions, drawings, design, performance, exhibits and organised expositions, artifacts, prototypes and artworks and their projects, databases and software, and thematic maps (provided that they are supported by accompanying publications)
VQR evaluation panels (GEV) for subject areas
[List of the 14 GEV panels and their subject areas, shown as a figure and not reproduced in the text.]
Evaluation criteria
Hard sciences:
Subjects defined using WoS/Scopus or explicitly through lists of area-specific journal rankings (A, B, C, D)
Citations
Impact Factor or Scopus SJR
Informed peer review (IR) and peer review for non-journal articles
Soft sciences: peer review
Countless details:
Different evaluation for survey articles
Different thresholds for different panels, etc.
Example: GEV 03 (Chemistry)
[Grade matrices for 2004-2008 and 2009-2010 combining the Citations grade with the Impact Factor / SJR grade; not fully reproduced in the text.]
Example: article published in 2005, with:
Citations grade A (top 20%)
Impact Factor grade C (top 50%)
Overall grade: A, score +1
Example: GEV 03 (Chemistry)
[Same grade matrices as the previous slide.]
Example: article published in 2010, with:
Citations grade A (top 20%)
Impact Factor grade C (top 50%)
Outcome: informed peer review
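The two examples above combine a Citations grade and an Impact Factor / SJR grade into an overall outcome through a period-dependent lookup matrix. A minimal sketch of that lookup, assuming a table-driven design: only the two cells actually demonstrated on the slides are grounded in the deck, and the grade scores echo the +1 / +0.8 / +0.5 / 0 weights that appear later in the optimization diagrams.

```python
# Hypothetical sketch of the GEV 03 grade-combination lookup.
# Only two cells are grounded in the slides:
#   2004-2008: Citations A + IF grade C -> overall grade A (score +1)
#   2009-2010: Citations A + IF grade C -> informed peer review (IR)
GRADE_TABLES = {
    "2004-2008": {("A", "C"): "A"},
    "2009-2010": {("A", "C"): "IR"},
}
SCORES = {"A": 1.0, "B": 0.8, "C": 0.5, "D": 0.0}  # weights as in the diagrams

def overall_grade(year: int, citations: str, impact_factor: str) -> str:
    """Combine the two bibliometric grades for an article of a given year."""
    period = "2004-2008" if year <= 2008 else "2009-2010"
    # Cells missing from the (partial) table fall back to informed peer review.
    return GRADE_TABLES[period].get((citations, impact_factor), "IR")

print(overall_grade(2005, "A", "C"))  # -> "A", worth a +1 score
print(overall_grade(2010, "A", "C"))  # -> "IR": sent to informed peer review
```

The same grades thus yield different outcomes depending on the publication year, which is exactly why the selection software had to model the period-specific matrices.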
VQR Timeline
November 7, 2011: call for participation published
February 29, 2012: (incomplete) evaluation criteria published
June 15, 2012: product submission deadline for institutions
Selection process for institutions: 3 months (+ 2 weeks last-minute extension)
Outline
The Italian VQR research assessment exercise
The Sapienza experience
Results
Conclusions
Sapienza in a nutshell
One of the largest universities in Europe
129,500 students in 2010; 1st in Europe and 43rd in the world by number of students
One of the oldest in Italy, founded in the 14th century
Over 4,000 researchers in 63 departments
21 museums and more than 50 libraries
Research catalog including 250,000 publications, ~75,000 considered for the VQR
Selection approach
Top-down: central coordination for all departments based on a software system especially designed for the VQR
Goal: use optimization algorithms to maximize the expected total score of Sapienza
The same product may have different scores depending on:
the panel to which the product is submitted
the subject category in which the product is classified
Our software simulated all possible panel/subject category combinations, computing the expected score
Human validation selected reasonable combinations
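The simulation step described above amounts to an exhaustive argmax over panel/subject-category pairs. A sketch under assumed names (`expected_score` stands in for the real grade computation, which the slides do not detail):

```python
# Hedged sketch: enumerate every (panel, subject category) combination for a
# product and keep the one with the best expected score.
from itertools import product as cartesian

def best_combination(panels, categories, expected_score):
    """Return the (panel, category) pair maximizing the expected score."""
    best, best_score = None, float("-inf")
    for panel, category in cartesian(panels, categories):
        score = expected_score(panel, category)
        if score > best_score:
            best, best_score = (panel, category), score
    return best, best_score

# Toy expected scores echoing the physics-article example on the next slide,
# where panel 07 with ATMOSPHERIC SCIENCE yields the top expected grade.
toy = {("02", "ATMOSPHERIC SCIENCE"): 0.8, ("07", "ATMOSPHERIC SCIENCE"): 1.0,
       ("02", "METEOROLOGY"): 0.5, ("07", "METEOROLOGY"): 0.5}
combo, score = best_combination(["02", "07"],
                                ["ATMOSPHERIC SCIENCE", "METEOROLOGY"],
                                lambda p, c: toy.get((p, c), 0.0))
print(combo, score)  # ('07', 'ATMOSPHERIC SCIENCE') 1.0
```

The human validation pass then vetoes combinations that maximize the score but are not scientifically reasonable for the article.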
Example: journal article in physics

Expect. grade | Chosen panel | Relev. | Chosen subject category | Database
C | 02 | 3 | METEOROLOGY & ATMOSPHERIC SCIENCES | WoS
B | 02 | 3 | ATMOSPHERIC SCIENCE | Scopus
C | 03 | 0 | METEOROLOGY & ATMOSPHERIC SCIENCES | WoS
B | 03 | 0 | ATMOSPHERIC SCIENCE | Scopus
C | 04 | 0 | METEOROLOGY & ATMOSPHERIC SCIENCES | WoS
B | 04 | 0 | ATMOSPHERIC SCIENCE | Scopus
C | 07 | 0 | METEOROLOGY & ATMOSPHERIC SCIENCES | WoS
A | 07 | 0 | ATMOSPHERIC SCIENCE | WoS
C | 08 | 0 | METEOROLOGY & ATMOSPHERIC SCIENCES | ISI
A | 08 | 0 | ATMOSPHERIC SCIENCE | WoS
C | 11 | 0 | METEOROLOGY & ATMOSPHERIC SCIENCES | ISI
B | 11 | 0 | ATMOSPHERIC SCIENCE | WoS
C | 09 | 0 | METEOROLOGY & ATMOSPHERIC SCIENCES | ISI
A | 09 | 0 | ATMOSPHERIC SCIENCE | WoS

[In the original figure, the panel 02 (Physics) / Scopus rows are annotated as the "reasonable choice" and a panel 07 (Agricultural and Veterinary Sciences) / WoS row as the "maximization choice".]
Surviving big data
Problem: manually choosing the best reasonable panel/subject category combination for all eligible products would have been overwhelming!
Our solution (for hard sciences):
1. Initial automatic choice of a tentative panel/subject category for each journal article based on a maximum relevance metric we designed => yields an initial tentative grade for each article
2. Automatic selection of the best 3 and 6 products for each author based on the tentative grades
3. Manual validation of selected products only
4. Optimization algorithm re-executed every night
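Steps 1-2 of the solution can be sketched as follows: rank each author's products by their tentative grade and keep best-3/best-6 shortlists for manual validation. All names and grades here are illustrative; the slides do not show the actual implementation.

```python
# Sketch of steps 1-2: per-author shortlists from tentative grades.
from collections import defaultdict

def shortlists(tentative_grade, authorship, k_submit=3, k_reserve=6):
    """For each author, rank products by tentative grade and keep the
    top-3 (submission) and top-6 (reserve) lists for manual validation."""
    by_author = defaultdict(list)
    for author, product in authorship:
        by_author[author].append(product)
    result = {}
    for author, products in by_author.items():
        ranked = sorted(products, key=tentative_grade, reverse=True)
        result[author] = {"best3": ranked[:k_submit], "best6": ranked[:k_reserve]}
    return result

# Illustrative grades mirroring the weights in the optimization diagrams.
grades = {"p1": 1.0, "p2": 0.8, "p3": 0.0, "p4": 0.8, "p5": 0.0}
pairs = [("author1", p) for p in grades]
out = shortlists(grades.get, pairs)
print(out["author1"]["best3"])  # ['p1', 'p2', 'p4']
```

Restricting manual validation to these shortlists is what made the workload tractable; the nightly re-run (step 4) then refreshed the lists as corrections came in.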
VQRselect Web interface
[Screenshot showing, for each faculty member, the products assigned to them: the best 3 products per author, the best 6 products per author, all eligible products, and excluded products.]
Manual validation of panel/subject category combination
[Screenshot of the validation interface.]
Optimization algorithm
[Diagram, repeated over three slides with different assignments highlighted: Author 1 has 3 slots and products with expected scores +1, +0.8, 0, +0.8, 0; Author 2 has 2 slots and products with expected scores +0.5, +1. Products are matched to author slots so as to maximize the total expected score.]
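The diagrams above depict an assignment problem: products with expected scores must fill author slots, and a product may be submitted by at most one coauthor. The slides do not specify Sapienza's actual algorithm, so the following is only an illustrative greedy heuristic using the diagram's weights:

```python
# Hedged sketch of the product-to-slot assignment. Each product carries an
# expected score and a set of eligible coauthors; each author has a limited
# number of submission slots; at most one coauthor may submit a product.
def assign(products, slots):
    """products: list of (name, score, eligible_authors); slots: author -> capacity.
    Greedily place high-score products first, skipping zero-score ones."""
    remaining = dict(slots)
    chosen = {}
    for name, score, authors in sorted(products, key=lambda p: -p[1]):
        for author in authors:
            if remaining.get(author, 0) > 0 and score > 0:
                remaining[author] -= 1
                chosen[name] = author
                break
    return chosen

# Weights echo the diagram: +1, +0.8, 0, +0.8, 0 for Author 1's products
# and +0.5, +1 for Author 2's.
products = [("P1", 1.0, ["A1"]), ("P2", 0.8, ["A1"]), ("P3", 0.0, ["A1"]),
            ("P4", 0.8, ["A1"]), ("P5", 0.0, ["A1"]),
            ("P6", 0.5, ["A2"]), ("P7", 1.0, ["A2"])]
chosen = assign(products, {"A1": 3, "A2": 2})
print(chosen)
```

With shared coauthored products the problem becomes a weighted bipartite matching, for which exact solvers exist; a greedy pass like this is merely the simplest way to see the structure.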
Critical aspects (1/2)
Extremely tight time frame for selecting the research products
Large-scale coordination: 63 departments
(Incomplete) evaluation criteria known only 3.5 months before the submission deadline
Different evaluation criteria for different panels
Critical data not publicly available (e.g., the thresholds for determining whether a product is in the top 20%, etc.)
Strange/wrong choices: GEV09 with different criteria, the GEV01 Applied Math problem, ...
Critical aspects (2/2)
Extensive data quality problems in our research catalog:
Duplicates
Wrong classification (e.g., proceedings papers classified as articles)
Missing or wrong fields
Missing coauthors
Missing or wrong codes (DOI, PubMed, ISBN, ...)
Data quality problems also in WoS and Scopus (e.g., incorrect subject categories)
Sapienza timeline (3.5 months)
Phase 0: March 1 - April 11 [42 days]
What: VQRselect software development
Who: Sapienza publications group + Exaltech Srl
Phase 1: April 12 - May 6 [25 days]
What: product selection
Who: department heads
Phase 2: May 7 - June 22 [16 days]
What: additional info, upload of PDFs
Who: faculty members, department heads
Phase 3/4: May 23 - June 15 [24 days]
What: linking with WoS/Scopus, error corrections
Who: VQR task force
Timeline of product selection
[Chart: number of selected products from April 12 to June 15, falling from about 10,300 to about 10,000, with drops of 166, 19, 42, and 49 products across phases 1-4.]
Outline
The Italian VQR research assessment exercise
The Sapienza experience
Results
Conclusions
Selected products: over 92% of expected
[Pie chart: selected 10,019 products (92.4%), missing 823 (7.6%).]
Selected products: soft vs. hard sciences
[Pie chart: hard sciences 56%, soft sciences 44%.]
% selected products by type
[Pie chart: journal article 73.93%, book chapter 12.66%, monograph 8.35%, conference proceedings 4.65%, curatorship 0.44%, patent 0.05%, other 0.02%.]
Estimated scores for submitted journal articles (hard sciences)
[Pie chart: A 55%, B 17%, C 7%, D 11%, A/B 2%, B/C 4%, C/D 4%.]
Conclusions
The sheer size of Sapienza, the large number of products, data quality issues, incomplete evaluation criteria, and the short time frame made the process extremely critical
Top-down approach, using IT methods
Optimization algorithms used to maximize the expected score of Sapienza
The IT infrastructure was crucial for the success of the process
The role of IT in research assessment will increase in the future
Future Work
Transform the system into a day-by-day research assessment system
Modify our data model (the final data model was a mess, growing from 4 to 101 tables) to make it CERIF-compliant (working on it)
Allow for data-quality verification and more complex analysis
Better integration with other systems

Thanks
