Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13319)

Abstract

During an interaction, interactants exchange speaking turns. These exchanges can occur smoothly or through interruptions. Listeners can display backchannels, send signals to grab the speaking turn, wait for the speaker to yield the turn, or even interrupt and grab the speaking turn. Interruptions are very frequent in natural interactions. To create believable and engaging interactions between human interactants and embodied conversational agents (ECAs), it is important to endow virtual agents with the capability to manage interruptions, that is, the ability both to interrupt and to react to an interruption. As a first step, we focus on the latter, where the agent is able to perceive and interpret the user’s multimodal behaviors as either an attempt to take the turn or not. To this aim, we annotate, analyse and characterize interruptions in human-human conversations. In this paper, we describe our annotation schema, which embeds different types of interruptions. We then provide an analysis of multimodal features, focusing on prosodic features (F0 and loudness) and body (head and hand) activity, to characterize interruptions.

Acknowledgements

This work was performed as part of the ANR-JST-CREST TAPAS and ANR-JST-DFG PANORAMA projects.

Author information

Corresponding author

Correspondence to Liu Yang.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Yang, L., Achard, C., Pelachaud, C. (2022). Multimodal Analysis of Interruptions. In: Duffy, V.G. (eds) Digital Human Modeling and Applications in Health, Safety, Ergonomics and Risk Management. Anthropometry, Human Behavior, and Communication. HCII 2022. Lecture Notes in Computer Science, vol 13319. Springer, Cham. https://doi.org/10.1007/978-3-031-05890-5_24

  • DOI: https://doi.org/10.1007/978-3-031-05890-5_24

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-05889-9

  • Online ISBN: 978-3-031-05890-5

  • eBook Packages: Computer Science, Computer Science (R0)
