How to Look Next? A Data-Driven Approach for Scanpath Prediction

Boccignone, Giuseppe; Cuculo, Vittorio; D’Amelio, Alessandro

doi:10.1007/978-3-030-54994-7_10

Part of the book series: Lecture Notes in Computer Science (LNPSE, volume 12232)

Included in the following conference series:

International Symposium on Formal Methods

Abstract

When dealing with static stimuli, most current visual attention models follow the same procedure. Given an image, a saliency map is computed, which in turn may be used to predict a sequence of gaze shifts, namely a scanpath instantiating the dynamics of visual attention deployment. The temporal pattern of attention unfolding is thus confined to the scanpath generation stage, whilst salience is conceived as a static map that, at best, conflates a number of factors (bottom-up information, top-down information, spatial biases, etc.).
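The classic procedure can be illustrated with a minimal sketch. It assumes that fixations are drawn independently, with probability proportional to the saliency value of each cell; the function name `sample_scanpath`, the toy 3x3 map, and the independent-sampling rule are hypothetical choices made for illustration, not the procedure of any specific model.

```python
import random

def sample_scanpath(saliency, n_fixations, seed=0):
    """Draw (row, col) fixations with probability proportional to saliency.

    Assumption: fixations are sampled independently from one static map,
    so any temporal structure comes only from this sampling stage.
    """
    rng = random.Random(seed)
    cells = [(r, c) for r, row in enumerate(saliency) for c in range(len(row))]
    weights = [saliency[r][c] for r, c in cells]
    return rng.choices(cells, weights=weights, k=n_fixations)

# Toy static saliency map (hypothetical values), peaked at the centre.
saliency = [
    [0.0, 0.1, 0.0],
    [0.1, 0.6, 0.1],
    [0.0, 0.1, 0.0],
]
scanpath = sample_scanpath(saliency, n_fixations=5)
print(scanpath)
```

Because the map never changes, the first and the last fixation are drawn from the same distribution, which is the limitation the paragraph above points out.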

In this note we propose a novel sequential scheme consisting of three processing stages that rely on a center-bias model, a context/layout model, and an object-based model, respectively. Each stage contributes, at different times, to the sequential sampling of the final scanpath. We compare the method against classic scanpath generation that exploits a state-of-the-art static saliency model. Results show that accounting for the structure of the temporal unfolding leads to gaze dynamics close to human gaze behaviour.
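As a rough illustration of the idea that different stages drive sampling at different times, the sketch below switches between three toy priority maps according to the fixation index. The hard switching rule, the switch times, the map values, and all function names are assumptions made for this example; the abstract does not specify how the stages are scheduled or combined.

```python
import random

def stage_for(t, switch_times):
    """Return the index of the stage assumed active at fixation index t."""
    for stage, end in enumerate(switch_times):
        if t < end:
            return stage
    return len(switch_times)

def sample_staged_scanpath(maps, switch_times, n_fixations, seed=0):
    """Sample each fixation from the map of the stage active at that time."""
    rng = random.Random(seed)
    scanpath = []
    for t in range(n_fixations):
        current = maps[stage_for(t, switch_times)]
        cells = [(r, c) for r, row in enumerate(current) for c in range(len(row))]
        weights = [current[r][c] for r, c in cells]
        scanpath.append(rng.choices(cells, weights=weights, k=1)[0])
    return scanpath

# Hypothetical 3x3 priority maps, one per stage.
center_bias = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]   # all mass at the centre
context     = [[1, 1, 1], [0, 0, 0], [0, 0, 0]]   # layout favours the top row
objects     = [[0, 0, 0], [0, 0, 0], [1, 0, 0]]   # one object, bottom-left

# Assumed schedule: fixation 0 from center bias, 1-2 from context, rest from objects.
staged = sample_staged_scanpath(
    [center_bias, context, objects], switch_times=(1, 3), n_fixations=6
)
print(staged)
```

In this toy schedule the first fixation lands on the centre, the next two fall on the top row, and the remaining ones land on the object cell, so the distribution being sampled changes over the course of the scanpath.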

Notes

1. Code available at https://github.com/phuselab/CLE.
2. An implementation is provided at https://github.com/phuselab/RQAscanpath.

Author information

Authors and Affiliations

PHuSe Lab - Dipartimento di Informatica, University of Milan, Milan, Italy
Giuseppe Boccignone, Vittorio Cuculo & Alessandro D’Amelio

Corresponding author

Correspondence to Vittorio Cuculo.

Editor information

Editors and Affiliations

Emil Sekerinski, McMaster University, Hamilton, ON, Canada
Nelma Moreira, University of Porto, Porto, Portugal
José N. Oliveira, University of Minho, Braga, Portugal
Daniel Ratiu, Argo AI, Munich, Germany
Riccardo Guidotti, University of Pisa, Pisa, Italy
Marie Farrell, University of Liverpool, Liverpool, UK
Matt Luckcuck, University of Liverpool, Liverpool, UK
Diego Marmsoler, University of Exeter, Exeter, UK
José Campos, University of Minho, Braga, Portugal
Troy Astarte, University of Newcastle, Newcastle upon Tyne, UK
Laure Gonnord, Claude Bernard University, Lyon, France
Antonio Cerone, Nazarbayev University, Nur-Sultan, Kazakhstan
Luis Couto, University of Surrey, Guildford, UK
Brijesh Dongol, University of Surrey, Guildford, UK
Martin Kutrib, University of Giessen, Giessen, Germany
Pedro Monteiro, University of Lisbon, Lisbon, Portugal
David Delmas, Airbus Operations S.A.S., Toulouse, France

About this paper

Cite this paper

Boccignone, G., Cuculo, V., D’Amelio, A. (2020). How to Look Next? A Data-Driven Approach for Scanpath Prediction. In: Sekerinski, E., et al. Formal Methods. FM 2019 International Workshops. FM 2019. Lecture Notes in Computer Science, vol 12232. Springer, Cham. https://doi.org/10.1007/978-3-030-54994-7_10

DOI: https://doi.org/10.1007/978-3-030-54994-7_10
Published: 13 August 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-54993-0
Online ISBN: 978-3-030-54994-7
eBook Packages: Computer Science, Computer Science (R0)
