Dialog Acts in Spoken Language Processing

 
RASwDA: Re-Aligned Switchboard Dialog Act Corpus for Dialog Act Prediction in Conversations
 
COMS 6998: Advanced Topics in Spoken Language Processing (Spring 2024)
Week 8: 3/5 - Spoken Dialogue Systems
Project Members: Run Chen, Eleanor Lin, Shayan Hooshmand, Mariam Mustafa, Rose Sloan, Ritika Nandi, Alicia Yang, Andrea Lopez, Ansh Kothary, Isaac Suh, Catherine Lyu, Eric Chen, Sophia Horng and Julia Hirschberg
 
Outline
 
1. What are dialog acts (and why do we care)?
2. Switchboard Dialog Act Corpus
3. Re-Aligned Switchboard Dialog Act Corpus
 
What are dialog acts?
 
dialog act = communicative function + semantic content
Through dialog, speakers influence (=act on) each other's cognitive states and their surrounding environment (=context)
Many possible realizations of the same dialog act
"Could you open the window?"
"Please open the window."
etc.
 
Popescu-Belis, A. (2007). Dialogue Acts: One or More Dimensions? ISSCO Working Paper, 62, 1-46.
 
What are dialog acts?
 
Many possible realizations of the same dialog act
Many possible interpretations of the same utterance as different dialog acts
"an utterance of ['I'll be there before you'] can be taken under appropriate conditions e.g., as a promise, a prediction, a warning, or a remark on the speaker's and the addressee's dispositions . . . in each of these cases a different speech act has been performed." (Searle et al., Speech Act Theory and Pragmatics, 1980, p. 1)
promise > "I'll be there before you" (I promise I'll be on time)
prediction > "I'll be there before you" (I think I will arrive first)
etc.
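
The two observations above (one dialog act, many realizations; one utterance, many possible dialog acts) can be sketched as a small data structure. This is only an illustration: the `DialogAct` class, the function labels, and the content strings such as `open(window)` are hypothetical names chosen here, not an annotation scheme from the slides.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class DialogAct:
    """A dialog act: communicative function plus semantic content."""
    function: str  # e.g. "request", "promise" (hypothetical labels)
    content: str   # e.g. "open(window)" (hypothetical notation)


# Each act maps to some of its possible surface realizations.
REALIZATIONS = {
    DialogAct("request", "open(window)"): [
        "Could you open the window?",
        "Please open the window.",
    ],
    DialogAct("promise", "arrive_before(speaker, addressee)"): [
        "I'll be there before you",
    ],
    DialogAct("prediction", "arrive_before(speaker, addressee)"): [
        "I'll be there before you",
    ],
}


def candidate_acts(realizations):
    """Invert the mapping: utterance text -> every act it could realize."""
    index = defaultdict(list)
    for act, utterances in realizations.items():
        for utterance in utterances:
            index[utterance].append(act)
    return dict(index)


INDEX = candidate_acts(REALIZATIONS)
# The same words are compatible with more than one communicative function.
AMBIGUOUS = sorted(act.function for act in INDEX["I'll be there before you"])
```

Looking up "I'll be there before you" returns both the promise and the prediction reading, which is why a recognizer needs context (and, as later slides argue, prosody) rather than the words alone.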
 
Dialog Acts for Spoken Dialogue Systems
 
Useful abstraction for dialog systems when generating language
Learn when it is appropriate to generate different dialog acts
Use dialog act type as input when generating utterances
Dialog systems also need to accurately recognize dialog acts ("I'll be there before you": promise, prediction, warning, or remark?)
Dialog acts and prosody
Example: Expressing uncertainty (left) versus incredulity (right)
 
Chen, H., Liu, X., Yin, D., & Tang, J. (2017). A survey on dialogue systems: Recent advances and new frontiers. ACM SIGKDD Explorations Newsletter, 19(2), 25-35.
 
Hirschberg, J. (2016). Pragmatics and Prosody.
 
Switchboard Dialog Act Corpus
 
Extends the 1990s Switchboard Corpus with dialog act annotations
Designed for computational DA modeling, conversational speech recognition
1,155 telephone conversations
Typical length: 5-10 minutes
2 speakers per conversation
Americans from various regions (different dialects)
Topics: various (e.g., cars, criminal justice system, women's fashion, childcare, cooking, books, movies, air pollution)
Transcripts manually segmented into utterances and annotated with dialog acts
Also provides audio recordings of conversations
 
Stolcke, A., Ries, K., Coccaro, N., Shriberg, E., Bates, R., Jurafsky, D., ... & Meteer, M. (2000). Dialogue act modeling for automatic tagging and recognition of conversational speech. Computational Linguistics, 26(3), 339-373.
 
DA and Prosody in Switchboard Corpus
non-opinion statement: falling intonation
yes-no question: rising intonation
greeting ("conventional-opening"): rising intonation (uncertainty?)
opinion statement: falling intonation pattern broken to place emphasis on "better"
laughter: short, breathy bursts
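
The rising versus falling contrast described on this slide can be approximated by the slope of the pitch (F0) track near the end of an utterance. The sketch below is a crude heuristic under stated assumptions, not the analysis used for the slide: it assumes F0 values in Hz at a fixed frame rate, treats 0 as an unvoiced frame, and uses hypothetical parameters `tail_fraction` and `threshold` that would need tuning on real data.

```python
def final_f0_slope(f0_hz, tail_fraction=0.3):
    """Least-squares slope (Hz per frame) over the final voiced frames."""
    voiced = [f for f in f0_hz if f > 0]  # assumption: 0 marks unvoiced frames
    n_tail = max(2, int(len(voiced) * tail_fraction))
    tail = voiced[-n_tail:]
    if len(tail) < 2:
        raise ValueError("need at least two voiced frames")
    n = len(tail)
    mean_x = (n - 1) / 2
    mean_y = sum(tail) / n
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(tail))
    denominator = sum((x - mean_x) ** 2 for x in range(n))
    return numerator / denominator


def contour_label(f0_hz, threshold=0.5):
    """Label the end of a contour as rising, falling, or level."""
    slope = final_f0_slope(f0_hz)
    if slope > threshold:
        return "rising"
    if slope < -threshold:
        return "falling"
    return "level"


# Toy contours (made-up numbers, not measurements from Switchboard).
QUESTION_LIKE = [180, 0, 185, 190, 200, 215, 230]
STATEMENT_LIKE = [220, 210, 0, 200, 185, 170, 160]
```

A feature like this only makes sense if the utterance end point is placed correctly, since the slope is computed over the last frames inside the boundary.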
 
Dialog Act Classification on SwDA
This model used only audio input.
2022
Note: the DSTC3 corpus has about half as many labels as SwDA
 
Inaccuracies in Switchboard DA Corpus
 
Audio data not effectively leveraged due to misalignment with text
Inaccurate utterance boundaries affect prosodic analysis of DAs:
utterance duration
pitch range compression/final lowering at end of turn/topic
boundary tones at ends of phrases
Original automatic alignment skipped inarticulate/quiet words
MSU Switchboard: better alignments and transcriptions but also problems linking new transcripts to DA labels
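
To make the effect of inaccurate boundaries concrete, the sketch below compares an original utterance interval with a corrected one and reports how far each boundary moved and how much the measured duration changes. The `Interval` class and the timestamps are hypothetical and chosen only for illustration; they are not values from the corpus.

```python
from dataclasses import dataclass


@dataclass
class Interval:
    """An utterance span in seconds from the start of the recording."""
    start: float
    end: float

    @property
    def duration(self):
        return self.end - self.start


def boundary_errors(original, corrected):
    """Return (start shift, end shift, duration change) in seconds."""
    return (
        corrected.start - original.start,
        corrected.end - original.end,
        corrected.duration - original.duration,
    )


# Made-up example: the automatic alignment clipped both ends of the utterance.
ORIGINAL = Interval(start=12.40, end=13.10)
CORRECTED = Interval(start=12.15, end=13.42)
START_SHIFT, END_SHIFT, DURATION_CHANGE = boundary_errors(ORIGINAL, CORRECTED)
```

In this toy case the duration nearly doubles after correction, and the frames where final lowering or a boundary tone would appear lie outside the original interval entirely.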
 
Deshmukh, N., Ganapathiraju, A., Gleeson, A., Hamaker, J., & Picone, J. (1998). Resegmentation of SWITCHBOARD. In ICSLP.
Hirschberg, J. (2017). Pragmatics and prosody. The Oxford Handbook of Pragmatics, 532-549.
Shriberg, E., Stolcke, A., Jurafsky, D., Coccaro, N., Meteer, M., Bates, R., Taylor, P., Ries, K., Martin, R., & van Ess-Dykema, C. (1998). Can prosody aid the automatic classification of dialog acts in conversational speech? Language and Speech, 41(3–4), 443–492. https://doi.org/10.1177/002383099804100410
 
NXT Switchboard: 642/1155 SwDA conversations
Calhoun, S., Carletta, J., Brenier, J. M., Mayo, N., Jurafsky, D., Steedman, M., & Beaver, D. (2010). The NXT-format Switchboard Corpus: a rich resource for investigating the syntax, semantics, pragmatics and prosody of dialogue. Language Resources and Evaluation, 44, 387-419.
 
Re-Aligned Switchboard Dialog Act Corpus
 
SwDA transcripts + SwDA audio
NXT Switchboard
aeneas forced alignments (github.com/readbeyond/aeneas)
Manual correction
Aligned TextGrid
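
The slide does not show how the aligned TextGrid files were written, so the following is only a sketch of what that final artifact looks like: it renders a list of `(start, end, text)` utterance intervals in Praat's long TextGrid text format. The function names, the tier name `utterances`, and the sample intervals are assumptions made for illustration, and aeneas itself is not called here. Praat interval tiers are expected to be contiguous, so gaps between utterances are filled with empty intervals.

```python
def fill_gaps(intervals):
    """Insert empty intervals so the tier has no gaps (intervals must be sorted)."""
    filled = []
    cursor = intervals[0][0]
    for start, end, text in intervals:
        if start > cursor:
            filled.append((cursor, start, ""))
        filled.append((start, end, text))
        cursor = end
    return filled


def to_textgrid(intervals, tier_name="utterances"):
    """Render (start, end, text) tuples as a one-tier Praat TextGrid string."""
    if not intervals:
        raise ValueError("no intervals to write")
    intervals = fill_gaps(intervals)
    xmin, xmax = intervals[0][0], intervals[-1][1]
    lines = [
        'File type = "ooTextFile"',
        'Object class = "TextGrid"',
        "",
        f"xmin = {xmin}",
        f"xmax = {xmax}",
        "tiers? <exists>",
        "size = 1",
        "item []:",
        "    item [1]:",
        '        class = "IntervalTier"',
        f'        name = "{tier_name}"',
        f"        xmin = {xmin}",
        f"        xmax = {xmax}",
        f"        intervals: size = {len(intervals)}",
    ]
    for i, (start, end, text) in enumerate(intervals, start=1):
        escaped = text.replace('"', '""')  # Praat doubles quotes inside strings
        lines += [
            f"        intervals [{i}]:",
            f"            xmin = {start}",
            f"            xmax = {end}",
            f'            text = "{escaped}"',
        ]
    return "\n".join(lines) + "\n"


# Made-up intervals with a 0.3 s pause between the two utterances.
SAMPLE = [(0.0, 1.2, "okay"), (1.5, 3.0, 'she said "hi"')]
TEXTGRID = to_textgrid(SAMPLE)
```

Writing `TEXTGRID` to a `.TextGrid` file would let the boundaries be inspected and hand-corrected in Praat, which is the kind of manual correction step the slide lists.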
 
Realigned Data Improves DAC Accuracy
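
A comparison of dialog act classification (DAC) accuracy comes down to counting how often predicted tags match the gold tags. The sketch below shows that computation, plus per-label recall, on a toy example. The tag strings follow SwDA abbreviations (`sd` statement-non-opinion, `sv` statement-opinion, `qy` yes-no question, `b` backchannel), but the gold and predicted sequences are invented and are not results from the slide.

```python
from collections import Counter


def accuracy(gold, predicted):
    """Fraction of utterances whose predicted tag equals the gold tag."""
    if not gold or len(gold) != len(predicted):
        raise ValueError("gold and predicted must be non-empty and equal length")
    return sum(g == p for g, p in zip(gold, predicted)) / len(gold)


def per_label_recall(gold, predicted):
    """For each gold label, the fraction of its utterances tagged correctly."""
    totals = Counter(gold)
    hits = Counter(g for g, p in zip(gold, predicted) if g == p)
    return {label: hits[label] / totals[label] for label in totals}


# Toy sequences (hypothetical, for illustration only).
GOLD = ["sd", "sd", "sv", "qy", "b"]
PREDICTED = ["sd", "sv", "sv", "qy", "b"]
ACCURACY = accuracy(GOLD, PREDICTED)
RECALL = per_label_recall(GOLD, PREDICTED)
```

Per-label recall is worth reporting alongside overall accuracy because DA label distributions are typically skewed, so frequent tags can dominate the overall figure.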
 
Conclusion and Future Work
 
Dialog systems need to accurately predict and produce dialog acts
The Switchboard Dialog Act corpus is a valuable resource for dialog modeling but has inaccuracies in its automatic alignments
Our Re-Aligned Switchboard Dialog Act (RASwDA) corpus has improved performance on DA classification from speech
We plan to continue to release the full RASwDA corpus to the wider speech community
 
Questions?
 
Please post on EdStem

