Verb Disambiguation through Collocation Patterns

The DVC project: Disambiguation of Verbs by Collocation
an introduction to the linguistic theory of norms and exploitations
Patrick Hanks
Research Institute of Information and Language Processing,
University of Wolverhampton
patrick.w.hanks@gmail.com
 
Words are very ambiguous; dictionaries are misleading
In any dictionary, more than one sense is usually given for each word, and often many.
For example, in MWALED (Merriam-Webster's Advanced Learner's English Dictionary) the verb blow has 12 senses, plus 6 subsenses, plus 7 phrasal verbs (each with between 1 and 6 senses), plus 15 idiomatic phrases.
The noun is even more complicated.
Dictionaries do not tell the user (a learner or a programmer) how to distinguish one sense of a word from another.
WSD (word sense disambiguation) projects in NLP that rely on dictionaries have failed, according to leaders in the field (e.g. Ide and Wilks 2006).

Phraseological patterns of word use
Most utterances consist of words used in familiar patterns, e.g.:
The wind was blowing from the east;
the wind blew the napkin off the table;
the referee blew his whistle for the end of the match;
he blew his nose.
They blew up the bridge;
the bridge blew up.
These are examples of phraseological ‘norms’ associated with blow.
Unconsciously, ordinary language users repeat the same norms (patterns) over and over again, with minor variations in the various slots in the patterns,
e.g. ‘east’ alternates with ‘west’, ‘north’, ‘south’, etc.
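
The idea of a fixed frame with one variable slot can be sketched in a few lines of Python. This is only an illustration: the frame, the `DIRECTION` set, and the function name `matches_wind_norm` are assumptions made for the example and are not taken from the DVC project.

```python
# Hypothetical toy example: one norm for "blow" with a single variable slot.
# The lexical set below is illustrative, not an inventory from the project.
DIRECTION = {"east", "west", "north", "south"}

# Fixed part of the pattern; the final word is the open slot.
FRAME = ["the", "wind", "was", "blowing", "from", "the"]


def matches_wind_norm(sentence: str) -> bool:
    """Return True if the sentence is 'the wind was blowing from the <DIRECTION>'."""
    words = sentence.lower().rstrip(".;").split()
    if len(words) != len(FRAME) + 1:
        return False
    # The frame must match word for word; only the slot filler may vary.
    return words[:-1] == FRAME and words[-1] in DIRECTION


print(matches_wind_norm("The wind was blowing from the east;"))
print(matches_wind_norm("The wind was blowing from the table."))
```

The first call fills the slot with a member of the lexical set and so matches the norm; the second keeps the frame but fills the slot with a word outside the set.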

Patterns are unambiguous
Unlike words, patterns are unambiguous.
‘He blew up a bridge’ and ‘He blew up a balloon’ have quite distinct, unambiguous meanings, even though the words blow, bridge, and balloon can all be ambiguous when taken in isolation, out of context.
The verb is the pivot of the clause.
Each verb is associated with one or more stereotypical phraseological patterns.
For NLP and language teaching alike, there is a great need for a dictionary or inventory of normal phraseological patterns.
A pattern is a statistical probability, not a cut-and-dried certainty.
The aim must be to inventorize all normal usage, not all possible usage.
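
One way to read "a pattern is a statistical probability" is as a relative frequency over a sample of corpus lines that have each been labelled with a pattern. The sketch below assumes exactly that reading. The labels, the counts, and the 0.2 cut-off are invented for illustration; they are not project data, and a fixed threshold is not claimed to be how the project separates normal from merely possible usage.

```python
from collections import Counter


def pattern_probabilities(labels):
    """Relative frequency of each pattern label in a sample of corpus lines."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {pattern: n / total for pattern, n in counts.items()}


def norms(labels, threshold=0.2):
    """Keep only patterns frequent enough to count as normal usage.

    The threshold is an arbitrary illustrative choice.
    """
    probs = pattern_probabilities(labels)
    return {pattern for pattern, p in probs.items() if p >= threshold}


# Invented toy sample: ten corpus lines for "blow", one label per line.
sample = (
    ["wind blows"] * 5
    + ["blow up bridge"] * 3
    + ["blow nose"] * 1
    + ["one-off creative use"] * 1
)

print(pattern_probabilities(sample))
print(norms(sample))
```

With this toy sample, the two frequent labels pass the cut-off and the two labels seen only once do not, which mirrors the distinction between normal usage and everything that is merely possible.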

Norms and exploitations
The DVC project at RIILP is developing a method (Corpus Pattern Analysis) for identifying and building an inventory of prototypical phraseological norms.
www.pdev.org.uk
Each pattern consists of a syntagmatic structure plus lexical sets of collocations.
Understanding meaning depends on matching the wording of an actual utterance with a pattern.
Best match wins!
Speakers and writers sometimes exploit norms in various ways, for example to create new metaphors.
The DVC project is also studying the rules governing exploitations of phraseological norms.
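
The matching step described above (a syntagmatic structure plus lexical sets, with the best match winning) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for the example: the `Pattern` class, the two toy patterns, their lexical sets, and the scoring rule are hypothetical and do not reproduce the format or the algorithm used at www.pdev.org.uk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pattern:
    meaning: str           # gloss assigned when this pattern wins
    verb: tuple            # verb phrase as a token sequence
    object_set: frozenset  # illustrative lexical set for the object slot


# Hypothetical mini-inventory for "blew up"; the lexical sets are invented.
PATTERNS = [
    Pattern("destroy with explosives", ("blew", "up"),
            frozenset({"bridge", "building", "tank"})),
    Pattern("inflate", ("blew", "up"),
            frozenset({"balloon", "tyre", "airbed"})),
]


def contains_sequence(tokens, seq):
    """True if seq occurs as a contiguous run inside tokens."""
    n = len(seq)
    return any(tuple(tokens[i:i + n]) == seq
               for i in range(len(tokens) - n + 1))


def score(pattern, tokens):
    """0 if the verb is absent; otherwise 1, plus 1 if the object slot is filled."""
    if not contains_sequence(tokens, pattern.verb):
        return 0
    return 1 + (1 if pattern.object_set & set(tokens) else 0)


def best_match(sentence):
    """Return the meaning of the highest-scoring pattern, or None if none fits.

    Ties go to the earlier pattern in PATTERNS, a simplification of this sketch.
    """
    tokens = sentence.lower().strip(".").split()
    best = max(PATTERNS, key=lambda p: score(p, tokens))
    return best.meaning if score(best, tokens) > 0 else None


print(best_match("He blew up a bridge"))
print(best_match("He blew up a balloon."))
```

Both sentences share the same verb phrase, so the verb alone cannot decide between the two meanings; the filler of the object slot does, which is the sense in which the pattern, rather than the word, is unambiguous.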

Verbs in natural language are highly ambiguous, posing challenges for word sense disambiguation projects. This article introduces the role of phraseological norms and exploitations in distinguishing between different senses of verbs by analyzing collocation patterns. Through Corpus Pattern Analysis, the DVC project aims to create an inventory of typical phraseological norms to aid in language processing and teaching. By recognizing and leveraging these patterns, both speakers and writers can effectively convey nuanced meanings and create new metaphors.

  • Verb Disambiguation
  • Collocation Patterns
  • Phraseological Norms
  • Language Processing
  • Corpus Analysis

Uploaded on Sep 21, 2024 | 0 Views



