Abstracts

Invited Talks

New pupillometric methods for the assessment of ocular pathologies

 

Jean Lorenceau <jean.lorenceau@upmc.fr> (1), 1 - CNRS (France)

Although pupil responses to light stimulation have long been used to detect and assess ocular pathologies, current tests and protocols suffer from some limitations: lack of specificity and sensitivity, and influences of non-visual factors (cognition, drugs) on pupil responses. In this presentation I shall first describe the main cortical and sub-cortical structures and pathways controlling pupil size, summarize recent clinical and cognitive studies, and then present novel pupillary tests that were developed and tested with patients (past and ongoing studies), together with signal analysis methods.

The gaze relational index as a measure of cognitive processing depth in comprehension of complex animated diagrams: evidence of the influence of perceptual processing in building high-level mental models

 

Boucheix Jean-Michel <jean-michel.boucheix@u-bourgogne.fr> (1), 1 - Laboratoire d'Etude de l'Apprentissage et du Développement (France)

Understanding complex technical dynamic processes from animated diagrams (or multimedia presentations, in 2D or 3D) requires building spatio-temporal relations over time between events and event chunks. In this communication we will present the results of two experimental studies on the comprehension of a complex dynamic mechanical system from realistic animations. Learners can have difficulty in decomposing conventionally designed animations to obtain raw material suitable for building high quality mental models. A composition approach to designing animations based on the Animation Processing Model was developed as a principled alternative to prevailing approaches. Outcomes from studying novel and conventional animation designs (independent variable) were compared with respect to mental model quality, knowledge of local kinematics, and capacity to transfer (dependent variables). Study of a compositional animation that presented material in a contiguous fashion resulted in higher quality mental models of a piano mechanism than non-contiguous or control (conventional) versions. Why? In this presentation we will focus on the eye movements of participants, recorded during the learning time in the contrasted experimental conditions. The analysis of the eye movement data consisted in the elaboration of fine-grained, time-locked eye gaze indicators during learning. More precisely, we investigated the potential interest of a new measure of processing depth in the elaboration of causal relations and in the generation of inferences. This indicator was coined the gaze relational index, and was found to be significantly higher in the compositional animation condition that presented material in a contiguous fashion. Further development of the eye movement analyses of these studies and the potential validity, scope, and relevance of this measure will be discussed in the light of recent research which used the gaze relational index as a measure of visual expertise in the broader field of medicine.

Neural-Dynamic Architecture for Saliency-Driven and Memory Saccades and their Adaptation

 

Sandamirskaya Yulia <ysandamirskaya@ini.uzh.ch> (1), 1 - Institute of Neuroinformatics [Zurich] (Switzerland)

Saccadic eye movements are fast and precise movements that need to be planned based on a peripheral visual input. To enable the observed precision at speeds that hardly allow visual servoing, the saccade-generating neural circuitry must undergo continual adaptation. The sensorimotor mapping that translates the visual input into the motor command that unfolds into a saccadic movement, followed by fixation, needs to be learned and updated if, e.g., muscle properties change. Such adaptation requires a neuronal architecture to stabilise representations of the visually selected saccadic targets as well as saccadic motor plans, and to determine when and in which direction the mapping shall be updated. Here, I present a neural-dynamic architecture based on dynamic neural fields that enables initial learning and continual adaptation of saccadic eye movements, performed both in response to sensory input and from memory. A motor-based representation of targets is used to build up the memory needed to plan precise double-step saccades, despite the shift of the retinal reference frame between the two saccades. I will show a robotic instantiation of the model that demonstrates the workings of the neuronal architecture in a closed sensorimotor loop and with learning accompanying the behaviour.

Oral Presentations

Fixed-gaze head movement detection for triggering commands

 

Ju Qinjie <qinjie.ju@doctorant.ec-lyon.fr> (1), Chalon René <rene.chalon@ec-lyon.fr> (1), Derrode Stéphane <stephane.derrode@ec-lyon.fr> (1), 1 - Laboratoire d'InfoRmatique en Image et Systèmes d'information (France)

In the field of human-computer interaction, mobile eye-tracking devices (carried on a pair of glasses) can be used to interact with an object remotely, while on the move, and while keeping both hands free to perform the main activity. This process allows us to interact with objects beyond our reach and can sometimes be faster than traditional interaction. On the other hand, as the main task of the eyes is to observe the environment, it becomes difficult to differentiate a simple observation of an object in the scene from the will to interact with it (the Midas touch problem). To solve this problem, solutions have been proposed in the literature, such as the use of voluntary eye movements, the use of smooth pursuit, or coupling the eye-tracking with a secondary device. In this context, our study focuses on the analysis of voluntary head movements made while the user's eyes are fixed on the object of interest, in order to trigger various commands. In order to evaluate the appropriateness of this approach in realistic situations and to evaluate its performance, we conducted a test of the detection of 6 different fixed-gaze head movements on 40 people: head shaking (right, left), nodding (up, down) and tilting (right, left). During this test, we asked the participants to quickly learn these six movements, then to trigger various commands using them. The success rate is 70%, but this rate depends on the individuals and the gestures performed. As these movements are rarely used during the observation of an object, the Midas touch problem can be avoided, while keeping both hands free.

Modeling Multi-stability and Fixational Eye Movements

 

Parisot Kevin <kevin.parisot@gipsa-lab.grenoble-inp.fr> (1), Chauvin Alan <alan.chauvin@univ-grenoble-alpes.fr> (2), Phlypo Ronald <ronald.phlypo@gipsa-lab.grenoble-inp.fr> (3), Zozor Steeve <steeve.zozor@gipsa-lab.grenoble-inp.fr> (1), 1 - Grenoble Images Parole Signal Automatique (France), 2 - Laboratoire de Psychologie et NeuroCognition (France), 3 - Grenoble Images Parole Signal Automatique (France)

Multi-stable perception occurs when an ambiguous stimulus drives perceptual alternations. Understanding its mechanisms has a direct impact on perceptual inference and decision making. A model proposed by Shpiro and colleagues explains the dynamics of bistable perception through neural adaptation and driving noise. Eye movement data from an experiment, in which participants observed a moving Necker cube in a continuous viewing paradigm, revealed that micro-pursuit fixational eye movements (FEM) can occur; a type of movement not accounted for in current FEM models. Our analysis also suggested that FEM can have an influence on adaptation and noise (Parisot et al., ECEM'17, Hicheur et al., JOV'13). Therefore, we propose a modeling approach that could help predict and explain away interactions between FEM and multi-stability dynamics. It is based on energy potential fields whose distortions by attractors allow the emergence of multi-stability in the spatial domain for the gaze (w.r.t. the different visual attractors), as well as in the attentional and perceptual spaces. Adaptation and noise can be used as causal forces that impact the observed dynamics of the system in a "top-down" and/or "bottom-up" manner. Perceptual memory and/or anticipation of stimulus motion can be taken into account through temporal distortion of the potential field. The model is able to generate all observed eye movement phenomena, and if inverted given data, could provide insight on the possible causal relationship between eye movements, perception and multi-stability. By inferring the parameters of the functions that connect the oculomotor and perceptual model spaces, it is possible to test predictions and gain insights on the active internal forces that drive the observed dynamics of multi-stability. We propose an experimental protocol that allows the gathering of initial data necessary for model inversion followed by prediction testing.
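As a rough illustration of the kind of dynamics described above, the sketch below (not the authors' model; the double-well potential V(x) = (x^2 - 1)^2 / 4, the adaptation rule and all parameter values are illustrative assumptions) simulates a state variable drifting in a potential with two attractors, with driving noise and a slow adaptation term that destabilises the currently occupied attractor:

```python
import numpy as np

def simulate_state(n_steps=20000, dt=1e-3, noise=0.8, adapt_rate=0.5, seed=0):
    """Langevin dynamics in a double-well potential V(x) = (x^2 - 1)^2 / 4,
    with slow adaptation `a` tilting the potential away from the occupied well."""
    rng = np.random.default_rng(seed)
    x, a = 1.0, 0.0
    xs = np.empty(n_steps)
    for i in range(n_steps):
        grad = x * (x ** 2 - 1) + a                  # dV/dx plus adaptation tilt
        x += -grad * dt + noise * np.sqrt(dt) * rng.normal()
        a += dt * adapt_rate * (x - a)               # adaptation tracks the state
        xs[i] = x
    return xs

xs = simulate_state()
print(f"fraction of time near the rightward attractor: {np.mean(xs > 0):.2f}")
```

Model inversion in this toy setting would amount to estimating the noise amplitude, adaptation rate and potential shape from observed traces.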

Towards a general event-detector for head-mounted eye trackers

 

Holmqvist Kenneth <Kenneth.Holmqvist@psychologie.uni-regensburg.de> (1) (2) (3), Niehorster D. C. <diederick_c.niehorster@humlab.lu.se> (4), Zemblys R. <r.zemblys@tf.su.lt> (5), 1 - Department of Psychology, Regensburg University (Germany), 2 - Faculty of Arts, Masaryk University, Brno (Czech Republic), 3 - Department of Computer Science, Bloemfontein University (South Africa), 4 - Humanities Laboratory and Department of Psychology, Lund University (Sweden), 5 - Siauliai University (Lithuania)

Algorithmic event-detection for remote and tower eye-trackers has existed for half a century. Event classification for data from these eye-trackers is now as good as, or for some algorithms even slightly better than, what expert human coders can perform (Zemblys et al., 2018). These new algorithms are also increasingly noise-resilient (Hessels et al., 2016) and can detect more events than before, such as PSOs and smooth pursuit (Larsson et al., 2013, 2015). In contrast, for head-mounted eye-trackers, no algorithms exist that can reliably detect fixations and saccades in the data. This is because head-mounted eye-trackers overlay head movements onto the eye-movement signal, so that saccade profiles are smoothed and fixations look like smooth pursuit, which confuses existing algorithms developed for monitor-based studies when they are used on head-mounted data (Holmqvist and Andersson, 2017, p. 244). Head-mounted eye-trackers also tend to output specific eye movements such as the vestibulo-ocular reflex or motion-induced events such as spontaneous optokinetic nystagmus and vergence, which are seldom of interest in monitor-based eye tracking. If all these events could be detected, it would not only allow for correct measurement of fixations and saccades from head-mounted eye-trackers, but also open the door to studies that use all these events. Research with head-mounted eye-trackers is clearly held back by the lack of reliable algorithms for general event detection. In my talk, I will describe the foundations of our project to build a general event detector for head-mounted eye trackers using supervised machine learning.

Benchmark statistical models for eye movements

 

Barthelmé Simon <simon.barthelme@gipsa-lab.fr> (1), 1 - Grenoble Images Parole Signal Automatique (France)

Neural or mechanistic models of eye movements try to account for aspects of the data like saccade length or spatial preferences. The goal of the work presented here is to formulate minimal statistical models that account for eye movement statistics, to serve as benchmarks, or "null models" of sorts. We will explain how a simple maximum-entropy framework can be used to build statistical models that produce simulated scanpaths with the right statistics (on average): for example, scanpaths that have both the same empirical saliency and the same average saccade length as human observers.
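The abstract does not spell out the exact maximum-entropy construction; the sketch below is only a toy generative null model in the same spirit, in which each fixation is drawn from an empirical saliency map weighted by an exponential saccade-length penalty (the length scale `lam` is an assumption that would be tuned so the simulated mean saccade amplitude matches the human data):

```python
import numpy as np

def simulate_scanpath(saliency, n_fix=10, lam=8.0, seed=None):
    """Sample a scanpath whose fixations follow an empirical saliency map
    while penalising long saccades with exp(-distance / lam)."""
    rng = np.random.default_rng(seed)
    h, w = saliency.shape
    ys, xs = np.mgrid[0:h, 0:w]
    p = saliency.ravel() / saliency.sum()
    idx = rng.choice(h * w, p=p)                 # first fixation: saliency only
    path = [(idx // w, idx % w)]
    for _ in range(n_fix - 1):
        cy, cx = path[-1]
        dist = np.hypot(ys - cy, xs - cx)
        weights = saliency * np.exp(-dist / lam)  # saliency x saccade-length penalty
        p = weights.ravel() / weights.sum()
        idx = rng.choice(h * w, p=p)
        path.append((idx // w, idx % w))
    return np.array(path)

# toy usage: a blob-shaped saliency map
yy, xx = np.mgrid[0:64, 0:64]
sal = np.exp(-((yy - 32) ** 2 + (xx - 32) ** 2) / (2 * 10.0 ** 2))
print(simulate_scanpath(sal, n_fix=5, seed=0))
```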

Cognitive Ability Estimation and Reinforcement with Eye-tracking Games for Children with Multiple Disabilities

 

Didier Schwab <Didier.Schwab@imag.fr> (1), 1 - Laboratoire d'Informatique de Grenoble (France)

Children with multiple disabilities are often unable, or at least have great difficulty, speaking or making gestures such as writing or typing on a touch screen. Most tests to assess their cognitive abilities are completely irrelevant because it is often impossible to distinguish whether an absence of response is due to misunderstanding of the question, to the impossibility of finding an answer, or to the mechanical impossibility of giving an answer. In this demonstration, we present GazePlay (gazeplay.net), a free and open-source software that gathers several mini-games playable with most eye-tracking devices. This project is led by informatics academics, and associates parents of children with multiple disabilities, open-source community developers, students and professionals working in specialized centers welcoming this public on a daily basis. These games are designed with a playful objective in mind for the children, but also with the objective of evaluating and working on some of their skills, particularly cognitive ones. The analysis of the child's gaze and their interactions with objects on the screen can thus help therapists to monitor possible eye problems and evaluate the child's understanding. There are more than 20 games. They aim to improve action-reaction skills (throwing a cream pie, bursting bubbles...), selection (scratch cards, for example) or memorization, which can be trained, for example, to learn to communicate through pictograms. For each of these games, an assessment of the skill at a given time T can be deduced from the analysis of the eye traces at the end of a game, or analyzed over the longer term to follow the child's evolution.

Combining eye and mouse movement measures to infer the accumulation of evidence across eye fixations in visual search with visually degraded scenes

 

Quétard Boris <boris.quetard@uca.fr> (1), Quinton Jean-Charles <quintonj@univ-grenoble-alpes.fr> (2), Colomb Michèle <Michele.Colomb@cerema.fr> (3), Barca Laura <laura.barca@istc.cnr.it> (4), Pezzulo Giovanni <giovanni.pezzulo@istc.cnr.it> (4), Izaute Marie <marie.izaute@uca.fr> (1), Mermillod Martial <Martial.Mermillod@univ-grenoble-alpes.fr> (5), 1 - Laboratoire de Psychologie Sociale et Cognitive - Clermont Auvergne (France), 2 - Laboratoire Jean Kuntzmann, Grenoble (France), 3 - Centre d'études et d'expertise sur les risques, l'environnement, la mobilité et l'aménagement (France), 4 - Institute of Cognitive Sciences and Technologies - National Research Council (Italy), 5 - Laboratoire de Psychologie et NeuroCognition (France)

Visual search and identification of an object in the environment can be seen as decision-making processes in which sensory information is accumulated across eye fixations and in which expectations about the target object and its context are integrated. In traditional perceptual decision-making models, the accumulation of information is usually modelled as a passive process. This study focuses on the contribution of eye movements to integrating expectations about the target (its identity, its location) with degraded sensory information (fog or artificial noise) in the accumulation of evidence. We used the mouse-tracking paradigm, allowing us to infer dynamic aspects of the decision-making process from computer mouse movements. Target detection (target latency) and verification were measured through eye movements during visual search tasks in visually degraded scenes. Mediation models were used to assess the respective contributions of target detection and verification in the decision-making process. In Experiment 1, we manipulated the expectations about target location during visual search in indoor visual scenes. The results suggest that evidence for the target-absent response may be accumulated while the target is actively but unfruitfully searched for through eye movements. However, no clear conclusion can be drawn. Experiment 2 featured the visual search of a pedestrian in foggy driving-themed scenes, where we manipulated expectations about the presence of a pedestrian (target) or a roe deer (distractor). This allowed us to infer the accumulation of evidence for the target-absent response when a pedestrian (incorrect response) or a roe deer (correct response) was in the scene. Taken together, these studies emphasize the distinct contributions of target detection and verification in the accumulation of evidence toward the target-present and target-absent responses. Furthermore, they suggest an evidence accumulation-based mechanism addressing the search termination problem.

Designing quality metrics for panoramic videos: a human-centered study based on eye-tracking

 

Balzarini Raffaella <balzarin@imag.fr> (1) (2), Nabil-Mahrous-Yacoub Sandra <sandra.nabil-mahrous-yacoub@inria.fr> (1), 1 - Inria Grenoble Rhône-Alpes (France), 2 - Laboratoire d'Informatique de Grenoble (France)

The creation of high quality panoramic videos for immersive VR content is commonly done using a rig with multiple cameras covering the required scene. Unfortunately, this setup introduces both spatial and temporal artifacts due to the difference in optical centers. As a result, designing quality metrics is becoming increasingly important to assess panoramic videos. In our research, we developed quality-metric algorithms able to detect panoramic video merging errors. In order to validate these metrics, we need not only to test their reliability under computer vision conditions, but also to take into account human visual perception. Therefore, we propose a human-centered study that analyses humans' ability to find errors in panoramic videos, with the aim of comparing this performance with that of the computer. We expect to be able to answer the main research questions, including: what types of errors are most often seen by humans? Can humans look intensely at an object of the scene without considering it a mistake? Do humans perceive errors that are not identified by an algorithm, and vice-versa? In order to answer these research questions, our methodology involved two approaches: the study of visual attention on the objects of the panoramic scene, and the study of the annotations made by the participants to mark errors. The experimental set-up consisted of a large screen onto which the panoramic videos were projected (stimulus), linked to a keyboard that allowed participants to make annotations. Gaze data, related to visual attention, were collected with a wearable eye-tracker (Tobii Glasses 2) and processed with the gaze analysis software Tobii Pro Lab. In this study, eye-tracking had a crucial role. Preliminary results from AOI descriptive statistics show that the algorithms detect more errors than participants do, and that participants can observe anomalies without identifying them as errors.

Anticipating a volatile probabilistic bias in visual motion direction

 

Perrinet Laurent <laurent.perrinet@univ-amu.fr> (1), Pasturel Chloé <chloe.pasturel@gmail.com> (1), Montagnini Anna <anna.montagnini@univ-amu.fr> (1), 1 - Institut de Neurosciences de la Timone (France)

The brain has to constantly adapt to changes in the environment, for instance when a contextual probabilistic variable switches its state. For an agent interacting with such an environment, it is important to respond to such switches with the shortest possible delay. However, this operation has in general to be done with noisy sensory inputs and solely based on the information available at the present time. Here, we tested the ability of human observers to accurately anticipate, with their eye movements, a target's motion direction throughout random sequences of rightward/leftward motion alternations, with random-length contextual blocks of different direction probability. Experimental results were compared to those of a probabilistic agent optimized with respect to this switching model. We found a good fit to the behaviorally observed anticipatory responses, compared with other models such as the leaky-integrator model. Moreover, we could also fit the level of confidence reported by human observers with that provided by the model. Such results provide evidence that human observers may efficiently represent an anticipatory belief along with its precision. These results support a novel approach to more generically test human cognitive abilities in uncertain and dynamic environments.
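For readers unfamiliar with the leaky-integrator baseline mentioned above, a minimal sketch is given below (the trial encoding, the time constant and the block structure are illustrative assumptions, not the authors' implementation):

```python
import numpy as np

def leaky_estimate(directions, tau=10.0):
    """Leaky-integrator estimate of the probability of rightward motion.
    directions : sequence of 1 (rightward) / 0 (leftward) trial outcomes
    tau        : integration time constant, in trials (an assumption here)
    """
    p, estimates = 0.5, []
    for d in directions:
        p += (d - p) / tau           # exponential forgetting of past trials
        estimates.append(p)
    return np.array(estimates)

# a block-switching toy sequence: p(right) = 0.75, then 0.25
rng = np.random.default_rng(0)
seq = np.concatenate([rng.random(50) < 0.75, rng.random(50) < 0.25]).astype(int)
print(np.round(leaky_estimate(seq, tau=10.0)[-5:], 2))
```

An agent optimal with respect to the switching model would instead maintain a posterior over the current block's direction probability and over the probability that a switch has just occurred.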

Eye-tracking data analysis using hidden semi-Markovian models to identify and characterize reading strategies

 

Olivier Brice <briceolivier1409@gmail.com> (1) (2), Durand Jean-Baptiste <jean-baptiste.durand@imag.fr> (1) (2), Guérin-Dugué Anne <anne.guerin@gipsa-lab.grenoble-inp.fr> (3), 1 - Inria Grenoble - Rhône-Alpes (France), 2 - Laboratoire Jean Kuntzmann (France), 3 - Laboratoire Grenoble Images Parole Signal Automatique (France)

Textual information search is not a homogeneous process in time, neither from a cognitive perspective nor in terms of eye-movement patterns (Simola, 2008). The research objective is to analyze eye-tracking signals acquired while participants perform a reading task and simultaneously aim at making a binary decision: whether or not a text is related to a theme given a priori. This activity is expected to involve several phases with contrasting oculometric characteristics, such as normal reading, scanning and careful reading, associated with different cognitive strategies, such as creation and rejection of hypotheses, confirmation and decision. We propose an analytical data-driven method based on hidden semi-Markov models (Yu, 2010), composed of two stochastic processes. The former is observed and corresponds to eye-movement features over time, while the latter is a latent semi-Markov chain, which conditions the first process and is used to uncover the information acquisition strategies. Four interpretable strategies were highlighted: normal reading, fast reading, careful reading, and decision making, while three classes of texts were available: strongly, moderately and not related to the theme given a priori. This interpretation was derived using model properties such as dwell times and inter-phase transition probabilities, which characterize the latent process and whose dynamics change according to the relatedness of the text to the theme, and emission probabilities, which characterize the observed process per phase regardless of this relatedness. More importantly, model selection was performed using both an information-theoretic criterion and covariates used to reinforce the interpretation.
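The authors' model is a hidden semi-Markov model with explicit dwell-time distributions; as a simpler, freely available stand-in, the sketch below fits a plain Gaussian HMM (hmmlearn) to toy per-fixation features, just to illustrate how reading phases could be decoded from eye-movement features (the feature values, number of phases and use of a plain HMM are all assumptions):

```python
import numpy as np
from hmmlearn.hmm import GaussianHMM   # plain HMM used as a stand-in for the HSMM

# toy per-fixation features: fixation duration (ms), saccade amplitude
# (in letters) and a regression indicator, drawn from three synthetic phases
rng = np.random.default_rng(0)
normal_read = rng.normal([220.0, 8.0, 0.1], [30.0, 2.0, 0.1], size=(60, 3))
scanning    = rng.normal([110.0, 28.0, 0.0], [20.0, 5.0, 0.1], size=(60, 3))
careful     = rng.normal([260.0, 6.0, 0.6], [30.0, 2.0, 0.2], size=(60, 3))
X = np.vstack([normal_read, scanning, careful])

model = GaussianHMM(n_components=3, covariance_type="diag", n_iter=100,
                    random_state=0)
model.fit(X)                       # unsupervised estimation of the phases
phases = model.predict(X)          # most likely phase for each fixation
print(phases[:5], phases[60:65], phases[120:125])
```

A semi-Markov version additionally models how long each phase lasts (its dwell-time distribution) instead of assuming geometric durations.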

Eye-tracking in a multimodal approach for situation recognition: application to chess players

 

Balzarini Raffaella <balzarin@imag.fr> (1) (2), Guntz Thomas <thomas.guntz@inria.fr> (1), 1 - Inria Grenoble Rhône-Alpes (France), 2 - Laboratoire d'Informatique de Grenoble (France)

This study presents the role of visual attention analysis in an experimental context of multimodal observation. Our study's goal is to investigate the extent to which observations of eye-gaze, body posture, emotion and other physiological signals can be used to model the cognitive state of subjects. Our approach explores the integration of multiple sensor modalities to improve the reliability of detection of human displays of awareness and emotion. Domains of application where systems based on artificial cognitive models could provide appropriate assistance are, for instance, healthy autonomous ageing or automated training systems. The multimodal observation was deployed on people engaged in problem solving, applied to chess games. We observed chess players engaged in problem solving of increasing difficulty, playing on a touch screen against a virtual opponent on a web platform. We recorded their behavior with an experimental set-up composed of a touch-screen computer, a Kinect and a webcam. Eye-gaze data were collected using a remote eye tracker (Fovio) connected to the experimental set-up; the aggregated data were processed with analysis software (EyeWorks), which allowed us to analyse gaze spots, gaze traces and the ICA (Index of Cognitive Activity). The results of the gaze data, associated with verbal data collected and analyzed according to the techniques of RTA, show, among experts, varied and unexpected visual explorations and conceptualizations of game situations (chunks). This allows us to expand knowledge about strategies commonly deployed in these games and to achieve a more precise recognition of the situation. Initial joint results indicate that eye-gaze, body posture and emotion are good features to capture a participant's awareness of the current situation. Multimodal recordings can be used to estimate and predict the ability to respond effectively to challenging situations. This experiment also validates our equipment as a reproducible tool for the study of subjects engaged in problem solving.

Omnidirectional gaze data: feedback from the creation process of a new 360° videos, head & gaze dataset

 

David Erwan <erwan.david@univ-nantes.fr> (1), Coutrot Antoine <antoine.coutrot@ls2n.fr> (2), Perreira Da Silva Matthieu <Matthieu.Perreiradasilva@univ-nantes.fr> (1), Gutiérrez Jesús <Jesus.Gutierrez@univ-nantes.fr> (1), Le Callet Patrick <patrick.lecallet@univ-nantes.fr> (1), 1 - Laboratoire des Sciences du Numérique de Nantes (France), 2 - Laboratoire des Sciences du Numérique de Nantes (France)

Recent advances in virtual reality Head-Mounted Displays (HMD) and embedded eye-tracking systems have opened new opportunities for the study of visual attention. A VR headset's strongest feature is real-time display of omni-directional content, allowing users to experience full 360° scenes thanks to rotation and translation tracking of the HMD; coupled with powerful eye-tracking technology, we are able to precisely study head and eye movements. In a recent free-viewing experiment, participants explored 360° dynamic stimuli wearing a VR headset. They watched videos lasting 20 seconds each; participants started a viewing trial either at longitude 0° or 180° (center of the equirectangular content or the opposite side). Gaze and head rotation data were collected. The gathered data, processed into scanpaths (gaze data) or trajectories (head data), and saliency maps were released publicly along with 19 stimuli and a toolbox providing saliency map and scanpath/trajectory similarity measures. This dataset will be useful to the visual attention community in understanding the deployment of visual attention in dynamic 360° scenes; it is also more ecological with regard to the usual experimental (neck constraints) and content (screen display) restrictions. Visual attention in VR has applications in content encoding, compression and transmission, quality evaluation, foveated rendering, etc. We propose to reflect on the development of this dataset and the theoretical and practical issues that arose in relation to head and gaze data processing, and similarity measures in the 360° domain.
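Working with omnidirectional gaze data requires spherical rather than planar geometry; for instance, distances between fixations are naturally measured along great circles. Below is a small, self-contained sketch (not the released toolbox, whose API is not described here) converting equirectangular longitude/latitude gaze samples to unit vectors and computing their angular distance:

```python
import numpy as np

def equirect_to_unit(lon_deg, lat_deg):
    """Convert equirectangular gaze coordinates (degrees) to 3D unit vectors."""
    lon, lat = np.radians(lon_deg), np.radians(lat_deg)
    return np.stack([np.cos(lat) * np.cos(lon),
                     np.cos(lat) * np.sin(lon),
                     np.sin(lat)], axis=-1)

def great_circle_deg(p, q):
    """Angular (orthodromic) distance in degrees between two gaze points."""
    u, v = equirect_to_unit(*p), equirect_to_unit(*q)
    return np.degrees(np.arccos(np.clip(np.dot(u, v), -1.0, 1.0)))

# distance between a fixation at the equirectangular centre and one 90° away
print(great_circle_deg((0.0, 0.0), (90.0, 0.0)))   # -> 90.0
```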

Face stimuli influence the programming of saccade amplitude

 

Kauffmann Louise <louise.kauffmann@gmail.com> (1) (2), Entzmann Lea <Lea.Entzmann@gipsa-lab.grenoble-inp.fr> (2), Peyrin Carole <carole.peyrin@univ-grenoble-alpes.fr> (1), Chauvin Alan <alan.chauvin@univ-grenoble-alpes.fr> (1), Guyader Nathalie <Nathalie.Guyader@gipsa-lab.grenoble-inp.fr> (2), 1 - Laboratoire de psychologie et neurocognition (France), 2 - Grenoble Images Parole Signal Automatique (France)

Studies using the saccadic choice task (in which participants have to initiate saccades toward a target image presented along with a distractor image in the opposite visual field) showed faster and larger saccades toward face targets than toward other stimuli. Error saccades were also found to be shorter than correct ones, suggesting an online correction. To better control saccade amplitudes, we conducted a new saccadic choice experiment in which participants had to saccade toward a central cross added on the images. We still observed hypometric saccades (1) for saccades toward vehicles compared to faces and (2) for error saccades, which were followed by corrective saccades with very short latencies. These results suggest a parallel programming of saccades toward both images. The two saccade programs interact and interfere with each other and would be weighted by the saliency of the target, affecting saccade amplitude.

Conditional bet and conditional probability: eye tracking contribution and cultural differences

 

De Gans Gabriel <g.degans@gmail.com> (1), Baratgin Jean <jbaratgin@gmail.com> (2), 1 - Cognitions Humaine et ARTificielle (France), 2 - Cognitions Humaine et ARTificielle (France)

The New Paradigm of the study of reasoning argues that logic is inadequate to account for performance in reasoning tasks because reasoners use their everyday uncertain reasoning strategies, whose nature is probabilistic. This hypothesis is supported by the observations that participants assess both the probability of a conditional Pr(if A then C) and a bet on a conditional Bet(if A then C) in the same way as the conditional probability Pr(C|A), operating with Bayes' rule, and not as the probability of the material conditional of formal logic Pr(C or not-A) (Politzer et al. 2010). Baratgin et al. (2017) have shown, with an eye tracking methodology, that both probability assessments (Pr(if A then C) and Pr(C|A)) result from the same cognitive process, identified with the same gaze strategy, different from the one implemented for Bayes' rule. Participants adopt an updating context of revision (where the universe is evolving) and use a minimal rule (mentally removing the elements of the worlds that are not-A). In this experiment, we observe a similar gaze strategy for the conditional bet. We present participants with a drawing of seven chips (each one can be round or square, red or blue) and we ask them to assess the chances of winning a conditional bet about the shape and the color of a random chip (I bet you that if A then C). With an eye tracker, we analyse the distribution of gaze fixations over different areas of the drawing and show that the majority of participants solve this problem as if the not-A chips had been removed from the urn, as in the Baratgin et al. (2017) paper. We discuss how this eye tracking methodology could be used to investigate the cultural differences observed between Eastern and Western participants in a conditional bet task (Nakamura et al., 2018).
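As a worked example of the distinction the abstract relies on, consider a hypothetical urn of seven chips (the particular counts below are an assumption chosen only for illustration): the conditional probability Pr(red | round) differs from the material-conditional probability Pr(red or not round):

```python
from fractions import Fraction

# hypothetical urn of seven chips: (shape, colour)
chips = [("round", "red")] * 3 + [("round", "blue")] + \
        [("square", "red")] + [("square", "blue")] * 2

A = [c for c in chips if c[0] == "round"]          # antecedent A: the chip is round
C_given_A = [c for c in A if c[1] == "red"]        # consequent C among the A chips

p_conditional = Fraction(len(C_given_A), len(A))                 # Pr(C|A) = 3/4
p_material = Fraction(sum(c[0] != "round" or c[1] == "red"
                          for c in chips), len(chips))           # Pr(C or not-A) = 6/7
print(p_conditional, p_material)
```

Participants following the "minimal rule" described above would restrict attention to the four round chips, yielding 3/4 rather than the material-conditional value 6/7.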

A Framework for a Multimodal Analysis of Teaching Centered on Shared Attention and Knowledge Access

 

Dessus Philippe <philippe.dessus@univ-grenoble-alpes.fr> (1) (2), Aubineau Louise Héléna <hejlouisehelena@yahoo.fr> (2), Vaufreydaz Dominique <Dominique.Vaufreydaz@inria.fr> (3), Crowley James <James.Crowley@inria.fr> (3), 1 - Univ. Grenoble Alpes, Inria, F-38000 Grenoble (France), 2 - Univ. Grenoble Alpes, Laboratoire de Recherche sur les Apprentissages en Contexte (EA 602), F-38000 Grenoble (France), 3 - Univ. Grenoble Alpes, CNRS, Inria, Grenoble INP, LIG, 38000 Grenoble France (France)

The effects of teaching on learning are mostly uncertain, hidden, and not immediate. Research investigating how teaching can have an impact on learning has recently been given a significant boost by signal processing devices and data mining analyses. We devised a framework for the study of teaching and learning processes which posits that lessons are composed of episodes of joint attention and access to the taught content, and that the interplay of behaviors like joint attention, actional contingency, and feedback loops composes different levels of teaching. Teaching by social tolerance occurs when learners (Ls) have no attentional problems but their access to the taught knowledge depends on the teacher (T). Teaching by opportunity provisioning occurs when Ls can be aware of the taught content but lack access to it (e.g., lack of understanding), and T builds ad hoc situations in which Ls are provided with easier content. Teaching by stimulus or local enhancement occurs when Ls have full access to the content but lack attention toward it: T explicitly shows content to Ls, slows down her behaviors, and tells and acts in an adapted way (e.g., motherese). A variety of devices installed in a classroom will capture and automatically characterize these events. T's and Ls' utterances and gazes will be recorded through low-cost cameras installed on 3D-printed glasses, and T will wear a mobile eye tracker and a mobile microphone. Instructional material is equipped with QR codes so that Ls' and T's video streams can be processed to determine where people are looking, and to infer the corresponding teaching levels. This novel framework will be used to analyze instructional events in ecological situations, and will be a first step towards building a "pervasive classroom", where eye-tracking and sensor-based devices analyze a wide range of events in a multimodal and interdisciplinary way.

Neural Modelling of Antisaccade Performance of Healthy Controls, Schizophrenia and Obsessive-Compulsive Disorders Patients

 

Cutsuridis Vassilis <vcutsuridis@gmail.com> (1), 1 - School of Computer Science, University of Lincoln (United Kingdom)

In the antisaccade paradigm, subjects are instructed to perform eye movements in the opposite direction from that of a peripheral visual stimulus, while fixating a central stimulus. The paradigm requires the parallel programming of two decision processes: the suppression of an erroneous prosaccade towards the peripheral stimulus and the initiation of a volitional antisaccade to the mirror position. Although healthy controls (CTL) typically make few errors, patients suffering from schizophrenia (SCZ) and obsessive-compulsive disorder (OCD) make more errors and display increased and more variable latencies of error prosaccades and antisaccades. Deficits in the antisaccade performance of these patients are generally interpreted as an impaired top-down inhibitory signal that fails to suppress the erroneous responses. Neural network models with mutual inhibition, implementing non-linear accumulation of information prior to the decision (eye movement), are presented. Two decision signals, representing the volitional antisaccade from the Frontal Eye Fields (FEF) and the reactive prosaccade from the Lateral Intraparietal Area (LIP), are integrated in a one-dimensional competitive neural network of the intermediate layer of the superior colliculus. The model accurately reproduces the error rates and latency distributions of error prosaccades, antisaccades and corrected antisaccades of CTL, SCZ and OCD cohorts of subjects. The model predicts that the antisaccade performance of SCZ patients is due to a noisier rate of information accumulation, although they are as confident as CTL subjects. On the other hand, the antisaccade performance of OCD patients is due to noise in the information accumulation process, although they are less confident about their decisions than the controls. Finally, it is the competition between the two decision processes in the superior colliculus, and not a third top-down inhibitory signal suppressing the erroneous response, that accounts for antisaccade performance in healthy, schizophrenia and OCD subjects.
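Below is a minimal sketch of the kind of mutually inhibiting accumulator race described above, reduced to two units rather than a full one-dimensional collicular map (all parameter values are illustrative assumptions, not the fitted model):

```python
import numpy as np

def antisaccade_trial(rate_pro=1.2, rate_anti=1.0, inhibition=0.3,
                      noise=0.6, threshold=60.0, dt=1.0, seed=None):
    """One trial of two mutually inhibiting, noisy accumulators racing to
    threshold. Returns ('pro'|'anti', latency in arbitrary time steps)."""
    rng = np.random.default_rng(seed)
    pro = anti = 0.0
    t = 0.0
    while True:
        t += dt
        pro  += dt * (rate_pro  - inhibition * anti) + noise * np.sqrt(dt) * rng.normal()
        anti += dt * (rate_anti - inhibition * pro)  + noise * np.sqrt(dt) * rng.normal()
        pro, anti = max(pro, 0.0), max(anti, 0.0)
        if pro >= threshold:
            return "pro", t       # erroneous prosaccade wins the race
        if anti >= threshold:
            return "anti", t      # correct antisaccade wins the race

trials = [antisaccade_trial(seed=i) for i in range(500)]
error_rate = np.mean([r == "pro" for r, _ in trials])
print(f"error rate: {error_rate:.2f}")
```

In this toy setting, increasing the accumulation noise raises the error rate and widens the latency distributions, qualitatively mirroring the patient groups described in the abstract.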

Posters

Attention towards emotional natural scenes during emotional and action appraisals

 

Campagne Aurélie <aurelie.campagne@univ-grenoble-alpes.fr> (1), Nicolas Gaëlle <gaelle.nicolas1@phelma.grenoble-inp.fr> (1), 1 - Laboratoire de Psychologie et NeuroCognition (France)

A large number of previous studies have shown that emotional stimuli engage more attention and are detected and identified faster than neutral stimuli. Greater attention is also reported for negative stimuli compared to positive stimuli. Visual processing of emotional stimuli also seems to depend on task demands. The majority of studies have focused on the appraisal of the emotional experience. Using ocular measures, the present study investigated, in young people, how attention towards emotional natural scenes differs between two explicit cognitive appraisal tasks: one emotional, based on the self-emotional experience, and one motivational, based on the tendency to action. A categorization task was also used as a control task. In addition, two arousal levels of stimuli were considered in order to evaluate their relative role in task-related effects on the visual processing of emotional scenes. Regardless of valence and arousal level, our results revealed that both the emotional and motivational tasks mainly differed from the categorization task with a shorter latency of the first saccade, a higher number of saccades, and a higher amplitude and spatial spread of saccades. Fixation duration only tended to be longer when identifying the tendency to action (avoidance, approach, no action) than when identifying the self-emotional experience (pleasant, unpleasant, neutral). Unpleasant stimuli induced a lower amplitude and a shorter duration of the first saccade and a lower total amplitude of saccades than positive and neutral stimuli in all tasks, suggesting a more local exploration for unpleasant stimuli regardless of task demands. A longer latency of the first saccade was also observed for unpleasant stimuli compared to other stimuli, but only for the tendency-to-action and categorization tasks. The effects observed were relatively preserved for both low- and high-arousal stimuli. The results confirm the close link between emotion and tendency to action.

Active neural field model of goal directed eye-movements

 

Quinton Jean-Charles <quintonj@univ-grenoble-alpes.fr> (1), Goffart Laurent <laurent.goffart@univ-amu.fr> (2), 1 - Laboratoire Jean Kuntzmann (France), 2 - CNRS (France)

For primates (including humans), interacting with objects of interest in the environment often involves their foveation, many of them not being static (e.g. other animals, or relative motion due to self-induced movement). Eye movements allow the active and continuous sampling of local information, exploiting the graded precision of visual signals (e.g., due to the types and distributions of photoreceptors). Foveating and tracking targets thus requires adapting to their motion. Indeed, considering the delays involved in the transmission of retinal signals to the eye muscles, a purely reactive scheme could not account for the smooth pursuit movements which maintain the target within the central visual field. Internal models have been posited to represent the future position of the target (for instance extrapolating from past observations), in order to compensate for these delays. Yet, adaptation of the sensorimotor and neural activity may be sufficient to synchronize with the movement of the target, converging to encoding its location here-and-now, without explicitly resorting to any frame of reference (Goffart et al., 2017). Committing to a distributed dynamical systems approach, we relied on a computational implementation of neural fields to model an adaptation mechanism sufficient to select, focus on and track rapidly moving targets. By coupling the generation of eye movements with dynamic neural field models and a simple learning rule, we replicated neurophysiological results that demonstrated how the monkey adapts to repeatedly observed moving targets (Bourrelly et al., 2016; Quinton & Goffart, 2018), progressively reducing the number of catch-up saccades and increasing smooth pursuit velocity (yet not going beyond the here-and-now target location). We now focus on eye movements observed in the presence of two simultaneously moving centrifugal targets (Goffart, 2016), for which the reduction to a single trajectory with some predicted dynamics (e.g., target center) is even more inappropriate.
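For reference, the dynamics referred to above are of the Amari neural-field type; the sketch below is a generic one-dimensional field with local excitation and global inhibition whose activity peak settles on the stronger of two competing inputs (all parameters are illustrative, and the learning rule and moving targets of the actual model are omitted):

```python
import numpy as np

def dnf_step(u, inp, w_exc=1.0, sigma=5.0, w_inh=0.6, tau=10.0, h=-2.0, dt=1.0):
    """One Euler step of a 1D Amari-type neural field with
    local (Gaussian) excitation and global inhibition."""
    x = np.arange(u.size)
    f = 1.0 / (1.0 + np.exp(-u))                                # firing-rate nonlinearity
    kernel = w_exc * np.exp(-0.5 * ((x - x.mean()) / sigma) ** 2)
    exc = np.convolve(f, kernel, mode="same")                    # lateral excitation
    inh = w_inh * f.sum()                                        # global inhibition
    return u + dt / tau * (-u + h + inp + exc - inh)

n = 100
u = np.full(n, -2.0)                 # field at resting level
inp = np.zeros(n)
inp[30], inp[70] = 5.0, 4.0          # two competing target locations
for _ in range(300):
    u = dnf_step(u, inp)
print(int(np.argmax(u)))             # the activity peak sits at the stronger input (30)
```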

Influence of eye-movements on multisensory stimulus localization: experiments, models and robotics applications

 

Lefort Mathieu <mathieu.lefort@univ-lyon1.fr> (1), Quinton Jean-Charles <quintonj@univ-grenoble-alpes.fr> (2), Forest Simon <simon.forest@ecl14.ec-lyon.fr> (2) (1), Techer Adrien <adrien.techer@etu.univ-lyon1.fr> (3) (1), Chauvin Alan <alan.chauvin@univ-grenoble-alpes.fr> (4), Avillac Marie <marie.avillac@univ-lyon1.fr> (3), 1 - Laboratoire d'InfoRmatique en Image et Systèmes d'information (France), 2 - Laboratoire Jean Kuntzmann (France), 3 - Centre de recherche en neurosciences de Lyon (France), 4 - Laboratoire de Psychologie et NeuroCognition (France)

To make sense of their environment, both humans and robots need to construct a consistent perception from many sources of information (including visual and auditory stimulation). Multimodal merging thus plays a key role in human perception, for instance by lowering reaction times and detection thresholds. Psychophysics experiments have shown that humans are able to fuse information in a Bayes-optimal way (Ernst & Banks, 2002), weighting each modality by its precision (i.e. the inverse of its perceived variance). Weights are usually estimated a posteriori from experimental data, but the mechanisms by which agents may estimate such precision online are not well studied. Some propositions may stem from sensorimotor accounts of perception and the predictive coding framework, with actions (e.g. saccades) being used to simultaneously estimate stimulus localization and sensory precision (Friston et al., 2011). In the context of the AMPLIFIER (Active Multisensory Perception and LearnIng For InteractivE Robots) project (2018-2022), we study the mutual influence of multisensory fusion and active perception. The project combines three complementary components. First, psychophysics experiments contribute to the confirmation and refinement of hypotheses, by manipulating stimuli and task constraints (e.g., audio-visual discrepancy, stimulus presentation time, number of fixations or saccades during presentation) and estimating their effect on saccadic eye movements, as well as the effects of eye movements on the localization of the target. Second, neurocomputational models based on the dynamic neural field framework provide distributed representations of stimuli, and allow us to replicate experimental data and make predictions. Finally, such models will be coupled with active decision-making and developmental learning of sensorimotor contingencies to be embedded on social robotic platforms, improving human-robot interactions by providing more natural (gaze) interactions and more appropriate reactions in complex environments.
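The precision-weighted (Bayes-optimal) fusion rule mentioned above can be stated in a few lines; the numerical values in the example are arbitrary:

```python
import numpy as np

def fuse(mu_v, var_v, mu_a, var_a):
    """Maximum-likelihood fusion of a visual and an auditory location estimate,
    each weighted by its precision (1/variance)."""
    w_v = (1 / var_v) / (1 / var_v + 1 / var_a)   # weight of the visual estimate
    mu = w_v * mu_v + (1 - w_v) * mu_a            # fused location
    var = 1 / (1 / var_v + 1 / var_a)             # fused variance (never larger)
    return mu, var

# visual estimate at 10 deg (precise), auditory estimate at 16 deg (noisy)
print(fuse(10.0, 1.0, 16.0, 4.0))   # -> (11.2, 0.8)
```

The open question raised by the project is how such variances could be estimated online, for instance through the variability of successive fixations, rather than fitted a posteriori.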

Intelligence test solving through eye-movements and mouse-movements

 

Rivollier Guillaume <guillaume.rivollier@univ-grenoble-alpes.fr> (1) (2), Quinton Jean-Charles <quintonj@univ-grenoble-alpes.fr> (2), Gautheron Flora <floragautheron@gmail.com> (2), Smeding Annique <annique.smeding@univ-smb.fr> (1), 1 - Laboratoire Interuniversitaire de Psychologie (France), 2 - Laboratoire Jean Kuntzmann (France)

Among intelligence tests, Raven's Advanced Progressive Matrices (RAPM) are used to assess reasoning and problem-solving capabilities in adult humans through series of items of increasing difficulty. Each item is defined by a 3x3 matrix of pictograms whose spatial structure follows logical rules. One pictogram is always missing, and the participant must find the correct pictogram among 8 proposed answers. Classically, performance is based on the number of correct responses given by the participant. Yet, research has shown that performance is correlated with the strategies used to solve the items (Carpenter et al., 1990; Vigneau et al., 2006). Two extreme strategies can be defined and measured through eye-tracking: constructivism (inferring the rules from the matrix) vs. elimination (screening the answers for the most probable one). In the present work, we investigated whether the observed impact of visual active sampling on performance was specific to the test design or could be altered by constraining the interactions with the test items. For this purpose, we developed a computerized dynamic version of the RAPM test, with additional mouse interactivity options to select the visible pictograms: 1) full matrix and answers visible at all times (original design), 2) answers hidden and matrix made visible by clicking on it (and reciprocally), 3) a single line of the matrix or all answers visible at once, 4) three user-selected pictograms of the matrix or the answers visible at once. We previously replicated results from the literature by statistically predicting performance from mouse movements. Nevertheless, the four experimental conditions allow us to go further by studying how much of the eye-movement patterns are transferred to mouse movements (e.g. due to differences in timing and motor cost), and how both types of movements are coordinated and integrated into the overall behavioral strategy of the participant.

Joint eye movements and EEG analysis during a saccadic choice task: study of the Spike Potential as neural marker of the performance

 

Guyader Nathalie <Nathalie.Guyader@gipsa-lab.grenoble-inp.fr> (1), Guérin Dugué Anne <Anne.Guerin@gipsa-lab.grenoble-inp.fr> (1), Entzmann Léa <Lea.Entzmann@gipsa-lab.grenoble-inp.fr> (1), 1 - Grenoble Images Parole Signal Automatique (France)

Using the saccadic choice task, studies have shown that we are able to initiate reliable saccades toward an image of a human face in just 100-120 ms (Crouzet et al., 2010, Guyader et al., 2017). During this task, two images, a face and a vehicle, are presented simultaneously on the left and right side of a screen and participants have to make a saccade as fast as possible toward a predefined target (face or vehicle). Recently, using the same task, we have found that saccades toward faces are not only faster but also have a larger amplitude than saccades directed toward vehicles. Moreover, we observed that error saccades (i.e. not directed toward the target) were smaller than correct saccades. The kinematic analysis of saccades suggests that error saccades are interrupted in order to initiate a corrective saccade toward the target. The aim of the present study is to investigate these effects using co-registration with electroencephalography (EEG). 26 participants took part in a saccadic choice task experiment. Eye movement and EEG signals were recorded synchronously. Participants performed two sessions, one with the face as target and one with the vehicle as target. We compared the event-related potentials at stimulus onset and the saccade-related potentials at the onset of the first saccade after stimulus presentation for the different conditions (Target: face, vehicle; Saccade Accuracy: correct, error). First results showed that the spike potential (SP) at the occipital site was not modulated by saccade amplitude (here, saccade amplitudes were between 3° and 10°). The SP was smaller for the face condition than for the vehicle condition, and for error saccades than for correct saccades. A trend was observed (p=0.054) for the Target x Saccade Accuracy interaction, with a smaller SP for error compared to correct saccades only in the face condition.

Decomposition of the Lambda-wave using EEG and eye-tracking data coregistration

 

Benerradi Johann <johann.benerradi@gmail.com>, Maurandi Victor <victor.maurandi@gmail.com>, Phlypo Ronald <ronald.phlypo@gipsa-lab.grenoble-inp.fr> (1), Rivet Bertrand <bertrand.rivet@gipsa-lab.grenoble-inp.fr> (1), Guerin-Dugue Anne <anne.guerin@gipsa-lab.fr> (1), 1 - Grenoble Images Parole Signal Automatique (France)

EEG and eye-tracking data co-registration is an interesting experimental technique to assess cognitive processes involved in ecological (everyday) visual tasks. However, in those paradigms saccadic potentials (i.e., the presaccadic potential, the spike potential, and the lambda-wave) overlap with potentials elicited by fixations, producing low-level confounding effects on the selected components of interest depending on the experimental conditions [Nikolaev et al., 2016]. Several studies have shown that the lambda-wave is composed of three positive sub-components with different latencies. The first two relate to saccade onset and the third, most prominent one, to saccade offset [Yagi, 1981; Thickbroom, et al., 1991]. These two late components (one onset, one offset) are distinguishable only for large saccade amplitudes (i.e., >5°) when they are estimated by averaging, because of the overlap between them. Recent methodological studies on the estimation of evoked potentials have shown the strength of linear models in decomposing the effects of temporally overlapping, but non-simultaneous, neural activities [Rivet, & Souloumiac, 2013; Bardy et al., 2014; Smith, & Kutas, 2015a, 2015b; Congedo et al., 2016; Kristensen et al., 2017]. 150 pictures with six randomly located, numbered, circular targets were presented to twenty-four participants with a sequential fixation task. A large database of eye movements with different saccade amplitudes and orientations was recorded with synchronized EEG activities. Using the General Linear Model, the potential at stimulus onset, the eye saccade related potentials (ESRP), and the eye fixation related potentials (EFRP) were estimated as a function of saccade amplitude and orientation. In this way, we disentangled the two late lambda-wave components whatever the saccade features. Moreover, even when ESRP and EFRP were linearly deconvolved, the saccade-offset sub-component was observed to always be modulated by the saccade features. This modulation must not be ignored in the interpretation of this early visual component.
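The general-linear-model deconvolution referred to above can be illustrated with a toy example: a generic least-squares implementation (not the authors' pipeline), with synthetic kernels, onsets and noise chosen purely for illustration:

```python
import numpy as np

def deconvolve(eeg, event_onsets, n_lags):
    """Least-squares (GLM) estimation of temporally overlapping evoked
    responses: one binary regressor per event type and per post-event lag."""
    n = eeg.shape[0]
    cols = []
    for onsets in event_onsets:                  # one array of onsets per event type
        for lag in range(n_lags):
            col = np.zeros(n)
            idx = np.asarray(onsets) + lag
            col[idx[idx < n]] = 1.0
            cols.append(col)
    X = np.column_stack(cols)
    beta, *_ = np.linalg.lstsq(X, eeg, rcond=None)
    return beta.reshape(len(event_onsets), n_lags)   # one response kernel per type

# toy trace with overlapping saccade-locked and fixation-locked responses
rng = np.random.default_rng(0)
n, n_lags = 4000, 40
saccades = np.arange(100, 3800, 150)
fixations = saccades + rng.integers(20, 60, size=saccades.size)  # jittered offsets
k_sacc = np.exp(-np.arange(n_lags) / 5.0)
k_fix = np.sin(np.arange(n_lags) / 6.0)
eeg = rng.normal(0.0, 0.1, n)
for t in saccades:
    eeg[t:t + n_lags] += k_sacc
for t in fixations:
    eeg[t:t + n_lags] += k_fix
kernels = deconvolve(eeg, [saccades, fixations], n_lags)
# recovered kernels should approximate k_sacc and k_fix despite the overlap
print(np.abs(kernels[0] - k_sacc).max(), np.abs(kernels[1] - k_fix).max())
```

The key requirement, as in the experiment described above, is enough variability in the interval between overlapping events so that the design matrix is well conditioned.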

Generating head motion from speech activities for a humanoid robot in a collaborative task using a cascaded LSTM model by predicting gaze targets

 

Nguyen Duc-Canh <duc-canh.nguyen@gipsa-lab.grenoble-inp.fr> (1), Bailly Gerard <gerard.bailly@gipsa-lab.grenoble-inp.fr> (1), Elisei Frédéric <frederic.elisei@gipsa-lab.grenoble-inp.fr> (1), 1 - GIPSA-Lab, Grenoble-Alpes Univ. and CNRS (France)

Human eye-head coordination in gazing at targets has a long history of study [1]. For example, when the head is free to move, the eye saccade amplitude is a function of head velocity (e.g. the faster the head, the smaller the saccade). In addition, when head velocity is low, the eyes usually move by at most 45 degrees and remain fixed until the gaze attains the target. In this work, we propose a model to generate head motion for a humanoid robot in a collaborative task where the robot acts as an instructor informing a manipulator who moves cubes to target positions. We performed Canonical Correlation Analysis (CCA) between head motion and other modalities (speech activities, gaze, pitch (F0) and the manipulator's arm movements) and found that gaze has a high correlation with the pitch orientation of the instructor's head motion. This is expected since the speech chunks partly refer to movable regions of interest in the visual scene that are intrinsically referred to through gaze. We propose a cascaded Long Short-Term Memory (LSTM) [2] model including two layers: (1) the first LSTM layer is trained to predict gaze movements and (2) the second LSTM layer predicts head motion from speech activities, the manipulator's arm movements and the predicted gaze. The layers are trained independently and fine-tuning is then performed to train the cascaded model. The results show that the cascaded model (with predicted gaze) generates head motion more accurately than a single model with the same inputs (speech activities and manipulator's arm movements). References: [1] D. Guitton, M. Volle, Gaze control in humans: eye-head coordination during orienting movements to targets within and beyond the oculomotor range, Journal of Neurophysiology. 58 (1987) 427-459. [2] S. Hochreiter, J. Schmidhuber, Long short-term memory, Neural Computation. 9 (1997) 1735-1780.
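A minimal PyTorch sketch of the cascaded architecture described above is given below; the layer sizes, feature dimensions and the joint forward pass are illustrative assumptions (the authors train the two stages independently before fine-tuning):

```python
import torch
import torch.nn as nn

class CascadedGazeHead(nn.Module):
    """Stage 1: an LSTM predicts gaze targets from speech-activity and
    partner-arm features. Stage 2: a second LSTM predicts head pitch/yaw
    from the same inputs plus the predicted gaze."""
    def __init__(self, n_in=8, n_gaze=2, n_head=2, hidden=64):
        super().__init__()
        self.gaze_lstm = nn.LSTM(n_in, hidden, batch_first=True)
        self.gaze_out = nn.Linear(hidden, n_gaze)
        self.head_lstm = nn.LSTM(n_in + n_gaze, hidden, batch_first=True)
        self.head_out = nn.Linear(hidden, n_head)

    def forward(self, x):                       # x: (batch, time, n_in)
        g, _ = self.gaze_lstm(x)
        gaze = self.gaze_out(g)                 # predicted gaze trajectory
        h, _ = self.head_lstm(torch.cat([x, gaze], dim=-1))
        return self.head_out(h), gaze           # head motion + intermediate gaze

model = CascadedGazeHead()
head, gaze = model(torch.randn(4, 100, 8))      # 4 sequences of 100 frames
print(head.shape, gaze.shape)
```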

Doctor's Gaze: A Pilot Study of Feasibility and Relevance of a Protocol using Eye Tracking

 

Alegre Marion <alegre.marion@gmail.Com> (1), Berthelot Alexandra <pmbv91@gmail.com>, Bosson Jean-Luc <bossonj@univ-grenoble-alpes.fr>, Peltier Agnès <agnes.peltier@univ-grenoble-alpes.fr>, Guyader Nathalie <Nathalie.Guyader@gipsa-lab.grenoble-inp.fr>, 1 - Thématique scientifique et objectifs (France)

Introduction - The doctor's gaze is essential in doctor-patient communication. Only a few studies have analyzed the doctor's gaze, using qualitative approaches or quantitative ones with a webcam. The aim of the present study was to assess the doctor's gaze during "real" consultations with a mobile eye-tracker and to quantify the existence of a correlation between the doctor's gaze duration on the patient and the quality of the doctor-patient communication. Methods - We proposed a pilot study to evaluate the feasibility and relevance of using a mobile eye-tracker (iViewETG from SMI) in a real-world consultation. Feasibility was assessed through interviews (patients and doctor). The doctor's gaze duration on the patient's face and on the doctor's PC was measured during the office interview. After the consultation, the patient filled in a questionnaire evaluating the quality of doctor-patient communication, developed by Dr Sustersic. We analyzed the correlation between the gaze duration and the questionnaire's score (statistical test, Spearman coefficient). Results - 9 patients visiting their general practitioner in France in the region of Isère were included. Unfortunately, we had to stop inclusion because the doctor reported difficulties conducting his consultations with the eye tracker. We did not observe a significant correlation between the doctor's gaze duration and the score (R = 0.184, p = 0.635). Discussion - Regarding technical aspects and patient feedback, it seems possible to test again the use of an eye tracker during consultations. The pilot study could demonstrate the interest of eye tracking as a tool for immersion in the doctor's gaze during consultations. Many applications could follow, such as the use of eye tracking for the pedagogical development of medical students' training.

ANEMO: Quantitative tools for the ANalysis of Eye MOvements

 

Pasturel Chloé <chloe.pasturel@gmail.com> (1), Montagnini Anna <anna.montagnini@univ-amu.fr> (1), Perrinet Laurent <laurent.perrinet@univ-amu.fr> (1), 1 - Institut de Neurosciences de la Timone (France)

Eye movements are crucial bio-markers for a wide range of cognitive behaviours. While recordings of such movements may be provided by low- to high-cost measurement devices, there is no unique, commonly agreed method to quantify the different phases of their dynamics. Here we focus on eye movements performed during motion tracking. Based on prior knowledge about the dynamics of the different types of eye movements, we propose a set of robust fitting methods for the extraction of characteristic parameters of eye movements. In particular, we show how we can robustly extract the latency, initial acceleration and steady state of visually-guided smooth pursuit eye movements, as well as the velocity ramp of anticipatory pursuit. Compared with classical methods based on local linear regression for pursuit latency and on velocity thresholding for saccade detection, this method provides a more efficient tool for validating and categorizing tracking performance globally. We validated it on a large set of experimental data. Moreover, this code is made available as an open-source package at http://github.com/invibe/ANEMO, allowing the community to use and modify these methods.
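The ANEMO package itself is available at the URL above; as an illustration of the kind of robust parametric fitting it performs, the sketch below (not the package's API; the piecewise velocity model, parameter names and values are assumptions) fits an anticipatory ramp, latency, rise time constant and steady-state velocity to a synthetic pursuit trace:

```python
import numpy as np
from scipy.optimize import curve_fit

def pursuit_velocity(t, v_anti, latency, tau, v_steady):
    """Piecewise eye-velocity model: anticipatory ramp (slope v_anti) before
    `latency`, then exponential rise toward `v_steady` (simplified sketch)."""
    v = v_anti * t                                        # anticipatory ramp
    after = t >= latency
    v = np.where(after,
                 v_anti * latency + (v_steady - v_anti * latency)
                 * (1 - np.exp(-(t - latency) / tau)),
                 v)
    return v

# synthetic trace: 5 deg/s^2 anticipation, 120 ms latency, 12 deg/s steady state
t = np.linspace(0.0, 0.6, 600)
true = pursuit_velocity(t, 5.0, 0.12, 0.05, 12.0)
noisy = true + np.random.default_rng(0).normal(0.0, 0.5, t.size)
params, _ = curve_fit(pursuit_velocity, t, noisy,
                      p0=[1.0, 0.1, 0.05, 10.0], bounds=(1e-3, np.inf))
print(np.round(params, 3))   # estimates of [v_anti, latency, tau, v_steady]
```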

Multi-stability and fixational eye movements: an energy potential fields modeling approach

 

Parisot Kevin <kevin.parisot@gipsa-lab.grenoble-inp.fr> (1) (2), Chauvin Alan <alan.chauvin@univ-grenoble-alpes.fr> (1), Phlypo Ronald <ronald.phlypo@gipsa-lab.grenoble-inp.fr> (2), Zozor Steeve <steeve.zozor@gipsa-lab.grenoble-inp.fr> (2), 1 - Laboratoire de Psychologie et NeuroCognition (France), 2 - Grenoble Images Parole Signal Automatique (France)

Multi-stable perception occurs when an ambiguous stimulus drives perceptual alternations. Understanding its mechanisms has a direct impact on perceptual inference and decision making. A model proposed by Shpiro and colleagues explains the dynamics of bistable perception through neural adaptation and driving noise. Eye movement data from an experiment, in which participants observed a moving Necker cube in a continuous viewing paradigm, revealed that micro-pursuit fixational eye movements (FEM) can occur; a type of movement not accounted for in current FEM models. Our analysis also suggested that FEM can have an influence on adaptation and noise (Parisot et al., ECEM'17, Hicheur et al., JOV'13). Therefore, we propose a modeling approach that could help predict and explain away interactions between FEM and multi-stability dynamics. It is based on energy potential fields whose distortions by attractors allow the emergence of multi-stability in the spatial domain for the gaze (w.r.t. the different visual attractors), as well as in the attentional and perceptual spaces. Adaptation and noise can be used as causal forces that impact the observed dynamics of the system in a "top-down" and/or "bottom-up" manner. Perceptual memory and/or anticipation of stimulus motion can be taken into account through temporal distortion of the potential field. The model is able to generate all observed eye movement phenomena, and if inverted given data, could provide insight on the possible causal relationship between eye movements, perception and multi-stability.

Assessing the dynamic visual processing of informative local features with eye movements

 

Montagnini Anna <anna.montagnini@univ-amu.fr> (1), Benini Anna Paola <anna.benini@stud.unifi.it>, Del Viva Michela <maria.delviva@unifi.it>, 1 - Institut de Neurosciences de la Timone (France)

Visually salient features embedded in synthetic structured images typically attract a rapid foveating saccade even under very challenging visual conditions. However, a general definition of saliency, as well as its role in natural active vision, is still a matter of debate. Here we chose a specific set of local features, predicted by a constrained maximum-entropy model to be optimal information carriers (Del Viva et al., 2013), as candidate salient features. These local patterns are spatial arrangements of 3x3 black and white pixels (about 9 arcmin in size). On each trial we randomly selected 10 patterns for the target stimulus (s of them being classified as salient, with s = 1, 4, 6 or 10) and 10 non-salient patterns for the distractor. In a saccadic choice experiment we randomly presented target and distractor for 26 ms on the right and left side of the screen, respectively, at 5º eccentricity from the central fixation and at different angles (0º, ±45º, ±75º) with respect to the horizontal meridian. We recorded human participants' eye movements while they were asked to perform a saccade towards the most salient pattern. We estimated oculometric target-selection curves based on the landing positions of the first and second saccades with respect to the target, and evaluated saccadic choice performance as a function of saccadic latency. In addition, we analyzed saccadic curvature as a possible marker of automatic capture by salient patterns. Results point to a dynamic evolution of oculomotor selection, with a fast but imperfect attraction toward salient patterns and a further refinement resulting in a more accurate second saccade for the highest values of the signal-to-noise ratio (SNR). When analyzing the first saccade in more detail, choice accuracy improved with saccadic latency only for the highest SNR values, whereas saccadic curvature was slightly biased toward the non-targeted visual stimulus, regardless of its saliency.
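
As an illustration of how an oculometric target-selection curve might be summarized (this is not the authors' analysis pipeline), the sketch below fits a logistic choice curve to hypothetical proportions of first saccades directed at the target, as a function of the number s of salient patterns in the target stimulus.

```python
# Illustrative sketch (not the authors' pipeline): fit a logistic oculometric
# curve to the proportion of first saccades landing closer to the target than
# to the distractor, per signal level s. All proportions below are hypothetical.
import numpy as np
from scipy.optimize import curve_fit

def oculometric(s, s50, slope, lapse):
    """Choice curve rising from 0.5 (chance) toward 1 - lapse."""
    return 0.5 + (0.5 - lapse) / (1.0 + np.exp(-(s - s50) / slope))

signal_levels = np.array([1, 4, 6, 10])                      # number of salient patterns s
p_target = np.array([0.55, 0.70, 0.82, 0.93])                # hypothetical choice proportions

(s50, slope, lapse), _ = curve_fit(oculometric, signal_levels, p_target,
                                   p0=[5.0, 2.0, 0.05],
                                   bounds=([0.0, 0.1, 0.0], [10.0, 10.0, 0.2]))
print(f"selection threshold s50 = {s50:.1f} salient patterns, lapse rate = {lapse:.2f}")
```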

Toward Eye Gaze Enhanced Information Retrieval Relevance Feedback

 

Jambon Francis <francis.jambon@imag.fr> (1), Mulhem Philippe <philippe.mulhem@imag.fr> (1), Albarede Lucas <lucas.albarede@etu.univ-grenoble-alpes.fr> (1), 1 - Laboratoire d'Informatique de Grenoble (France)

Information Retrieval (IR) is dedicated to retrieving relevant documents according to a user's query. The literature in this field shows that gathering relevance information provided by the user on the documents retrieved by the IR system increases the overall quality of the system. The relevance information provided by the user is processed to refine his/her initial query, in a process called Relevance Feedback. Since it is cumbersome and time consuming for the user to explicitly provide such information, our hypothesis is that eye gaze information could be used to implicitly estimate the user's interests, and thus help the relevance feedback mechanism. The main research question tackled here is twofold: (1) what is the user's behavioural model at the visual level in an information retrieval task, and how can this model be used to determine the user's interests, and (2) how can such eye gaze elements be integrated effectively into a relevance feedback mechanism in classical IR systems that present result lists with document extracts (called snippets). To achieve this goal, we split the problem into the following steps: (a) to model the user's behaviour in front of a result list composed of snippets; (b) to define the eye gaze elements to be acquired and the way to link them to the user's interests in document contents; (c) to build relevance feedback mechanisms that are able to use these elements; and (d) to evaluate the proposal on classical IR test collections in comparison with other relevance feedback approaches. The work presented here focuses on the first two elements above: we define an experimental context to gather relevant information about the user's behaviour in front of a result display composed of snippets, and we deduce the eye movement (EM) elements that will need to be acquired in order to perform IR relevance feedback.
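
One classical way to realize step (c) would be a Rocchio-style query update in which gaze dwell time on each snippet serves as an implicit, graded relevance weight. The sketch below illustrates this idea over TF-IDF vectors; the snippets, dwell times and coefficients are hypothetical, and this is not the authors' system.

```python
# Minimal sketch (not the authors' system): a positive-only, Rocchio-style
# relevance feedback update where gaze dwell time per snippet acts as an
# implicit relevance weight. Data and names below are hypothetical.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer

snippets = [
    "eye tracking for information retrieval relevance feedback",
    "cooking recipes for beginners",
    "gaze based implicit relevance estimation in search result lists",
]
dwell_times_s = np.array([4.2, 0.3, 6.1])                    # hypothetical gaze dwell per snippet

vectorizer = TfidfVectorizer()
doc_vectors = vectorizer.fit_transform(snippets).toarray()
query_vector = vectorizer.transform(["eye tracking relevance feedback"]).toarray()[0]

weights = dwell_times_s / dwell_times_s.sum()                # normalized dwell as soft relevance
alpha, beta = 1.0, 0.75                                      # classical Rocchio coefficients
updated_query = alpha * query_vector + beta * (weights @ doc_vectors)

terms = np.array(vectorizer.get_feature_names_out())
print("expanded query terms:", terms[np.argsort(updated_query)[::-1][:5]])
```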

Bayesian modeling of lexical knowledge acquisition in BRAID, a model of visual word recognition.

 

Ginestet Emilie <emilie.ginestet@univ-grenoble-alpes.fr> (1), Valdois Sylviane <sylviane.valdois@univ-grenoble-alpes.fr> (1), Diard Julien <julien.diard@univ-grenoble-alpes.fr> (1), 1 - Laboratoire de psychologie et neurocognition (France)

There are several computational models of expert word recognition, expert word naming and eye movement control, but very few attempts to mathematically model reading acquisition. In particular, modelling the acquisition of lexical orthographic knowledge is one of the main current challenges. Our team has recently developed a new Bayesian model of word recognition, called BRAID, to simulate expert readers' performance. Here, we propose an extension of BRAID that implements a mechanism for the acquisition of new orthographic knowledge. The BRAID model integrates an attentional component modelled by a Gaussian probability distribution; its parameters (muA; sigmaA) respectively model the attentional focus point and the span of visual attention during letter-string processing. To model orthographic learning, we assume that visual attention displacements are chosen to optimize the accumulation of perceptual information about letters, so as to efficiently construct the new orthographic memory trace. To do this, we compute entropy gains to select the parameters (muA; sigmaA) of the next visual attention deployment. The results obtained in preliminary simulations suggest that, to acquire the spelling of a novel word, the model transitions from a letter-by-letter decoding strategy, when the word is first encountered, to a more global reading strategy, once some lexical information has already been memorized about the new word. The plausibility of the current results will be discussed by comparison with behavioural evidence on reading acquisition. We will further discuss the limits of our current model, which assumes a strictly orthographic word recognition process, without incorporating any phonological processing.
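
As a much-simplified illustration of entropy-driven attention selection (not the BRAID model itself), the sketch below chooses the next attentional focus muA as the letter position whose identity distribution currently has the highest entropy, i.e. where an additional attention deployment is expected to be most informative. The per-position posteriors are randomly generated placeholders.

```python
# Much-simplified sketch (not the BRAID model): select the next attentional
# focus muA as the letter position with the highest posterior entropy over
# letter identities. Posteriors below are random placeholders.
import numpy as np

def entropy_bits(p):
    p = np.clip(p, 1e-12, 1.0)
    return float(-(p * np.log2(p)).sum())

rng = np.random.default_rng(1)
# Hypothetical posterior over 26 letters at each of the 6 positions of a new word.
posteriors = rng.dirichlet(np.ones(26) * 0.3, size=6)

position_entropies = np.array([entropy_bits(p) for p in posteriors])
next_muA = int(np.argmax(position_entropies))                # focus where uncertainty is largest
print(f"per-position entropy (bits): {np.round(position_entropies, 2)}")
print(f"next attentional focus muA at letter position {next_muA}")
```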

Demos

Demo: Tobii Pro products

 

Luu Antoine <antoine.luu@tobii.com> (1), 1 - Tobii Pro (Sweden)

Presented products:
- Virtual Reality: HTC Vive with integrated Tobii eye tracker
- Mobile: Tobii Glasses 2
- Remote: Tobii X2 and X3

Demo: Cognitive Ability Estimation and Reinforcement with Eye-tracking Games for Children with Multiple Disabilities

 

Didier Schwab <Didier.Schwab@imag.fr> (1), 1 - Laboratoire d'Informatique de Grenoble (France)

Children with multiple disabilities are often unable, or at least find it very difficult, to speak, make gestures, write, or type on a touch screen. Most tests to assess their cognitive abilities are largely irrelevant, because it is often impossible to distinguish whether an absence of response is due to misunderstanding the question, to being unable to find an answer, or to being physically unable to give one. In this demonstration, we present GazePlay (gazeplay.net), a free and open-source software that gathers several mini-games playable with most eye-tracking devices. The project is led by computer science academics and brings together parents of children with multiple disabilities, open-source community developers, students, and professionals working in specialized centers that welcome these children on a daily basis. The games are designed with a playful objective in mind for the children, but also with the objective of evaluating and training some of their skills, particularly cognitive ones. The analysis of the child's gaze and of their interactions with objects on the screen can thus help therapists check for possible eye problems and evaluate the child's understanding. There are more than 20 games. They aim to improve action-reaction skills (throwing a cream pie, bursting bubbles...), selection (scratch cards, for example), or memorization, which can be trained, for example, to learn to communicate through pictograms. For each of these games, an assessment of the skill at a given time can be derived from the analysis of the eye traces at the end of a game, or tracked over the longer term to follow the child's progress.
