During development, humans and animals learn to make sense of their visual environment based on their momentary sensory input and their internal representation of earlier short- and long-term experiences. Despite decades of behavioral and electrophysiological research, it is still not clear how this perceptual process occurs, what representations the brain uses for it, and how these internal representations are acquired through visual learning. We follow a systematic research program to clarify these issues.
Visual statistical learning in humans and animals
What do people learn when they see a novel visual scene? To investigate this question, we developed a visual statistical learning paradigm and conducted a series of adult and infant experiments showing that, from a very early age, humans possess a fundamental ability to automatically extract the statistical regularities of unknown visual scenes in both time and space (Fiser & Aslin 2001, 2002a, 2002b, 2005). We argued that this core ability is indispensable for the formation of visual object representations in the brain, from the simplest levels of luminance changes up to the level of conscious memory traces, and that it constantly interacts with perceptual processes (Fiser & Lengyel 2021). To corroborate this proposal, we showed how statistical learning (SL) interacts with perceptual illusions (Fiser, Scholl & Aslin, 2007), how it creates object-based attention (Lengyel, Nagy & Fiser, 2021), how it supports zero-shot learning and generalization across haptic and visual modalities (Lengyel, Žalalytė, Pantelides, Ingram, Fiser, Lengyel & Wolpert, 2019), and how it is linked to active vision through eye movements (Arato, Rothkopf & Fiser 2020).
We demonstrated that, despite being viewed as completely unrelated (Fiser 2009), classical perceptual learning and SL represent two extreme versions of the same fundamental learning process (Fiser & Lengyel 2019). We also explored the developmental link between implicit SL and more explicit "rule-learning" (Janacsek, Fiser & Nemeth 2012; Nemeth, Janacsek & Fiser 2013). Using patient and fMRI studies, we identified the brain structures involved in SL (Roser, Fiser, Aslin, & Gazzaniga 2011; Roser, Aslin, McKenzie, Zahra & Fiser 2015; Karuza, Emberson, Roser, Cole, Aslin, & Fiser 2017). To clarify its universality and importance, we demonstrated SL in chicks (Rosa-Salva, Fiser, Versace, Dolci, Chehaimi, Santolin, & Vallortigara 2018) and honeybees (Avarguès-Weber, Finke, Nagy, Szabó, d’Amaro, Dyer & Fiser, 2020), identified the similarities and differences in the underlying learning mechanisms, and related these differences to humans' superior learning abilities.
Currently, we are expanding this line of research in four new directions. First, we are exploring hierarchical SL to trace the emergence of abstract hierarchical internal representations (Garber & Fiser in prep.). Second, we are extending our SL framework to the auditory domain (Szabo, Markus & Fiser in prep) and linking it to multimodal cue integration (Reguly, Nagy, Markus, & Fiser in prep). Third, we are investigating the role of SL in active learning (Nagy, Arato, Rothkopf & Fiser in prep). Finally, we are studying SL in autistic patients and in higher species other than humans to further clarify the origin of humans’ remarkable advantage in SL.
Probabilistic modeling of visual perception and statistical learning
What is the appropriate formal explanation of perception and learning? We have developed a computational framework of tightly coupled perception and learning in the brain based on the premise that cortical functioning can be best described as performing approximate probabilistic inference and learning for optimal prediction and control based on an internal model of the uncertain environment (Fiser, Berkes, Orban & Lengyel, 2010). We codified the various implementation strategies of such probabilistic frameworks in terms of how they treat uncertainty of the input signal and argued that a fully Bayesian version that encodes and uses the uncertainty of all of its latent variables is the closest to human behavior (Koblinger, Fiser & Lengyel 2021). As a first step toward supporting this proposal, we used such a fully Bayesian framework and showed that the behavioral results of our visual statistical learning experiments can be captured well by such a model, whereas they cannot be explained by recursive pairwise associative learning (Orban, Fiser, Aslin & Lengyel, 2008). This model suggests that humans code their sensory input through an "unconscious inference" process that interprets the statistical structure of the input based on previous experience, and looks for the simplest description of this input in terms of its possible underlying causes. We extended the concept of tightly integrated probabilistic perception and learning to characterize the interaction between visual and haptic cues (Atkins, Fiser & Jacobs, 2001).
As a second step, we demonstrated that, contrary to the classic view of early visual coding with fixed receptive fields, humans encode orientation information of the visual input in a dynamic context by continuously combining sensory information with expectations derived from earlier experiences (Christensen, Bex & Fiser, 2015). Moreover, we provided evidence that orientation and position information of small contour segments are encoded independently and in a different manner, and they are combined together by using their uncertainty according to the rules of optimal cue combination (Christensen, Bex & Fiser, 2019). This suggests that uncertainty of information is represented already at the earliest level of visual processing in accordance with the fully Bayesian proposal.
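The rule of optimal cue combination mentioned above can be illustrated with a minimal sketch. It assumes the standard textbook case of two independent Gaussian cues about the same quantity, each weighted by its reliability (inverse variance). The function name and the numbers are hypothetical and are not taken from the cited studies.

```python
def combine_cues(mean_a, var_a, mean_b, var_b):
    """Reliability-weighted combination of two independent Gaussian cues.

    Illustrative assumption: both cues are unbiased estimates of the same
    quantity with known variances. Not the model fitted in the cited work.
    """
    if var_a <= 0 or var_b <= 0:
        raise ValueError("variances must be positive")
    rel_a = 1.0 / var_a  # reliability of cue A
    rel_b = 1.0 / var_b  # reliability of cue B
    combined_mean = (rel_a * mean_a + rel_b * mean_b) / (rel_a + rel_b)
    combined_var = 1.0 / (rel_a + rel_b)  # never larger than either input variance
    return combined_mean, combined_var


# Hypothetical example: a reliable cue (variance 1) and a noisier cue (variance 4).
estimate, variance = combine_cues(10.0, 1.0, 14.0, 4.0)
print(estimate, variance)
```

In this sketch the combined estimate lies closer to the more reliable cue, and the combined variance is smaller than that of either cue alone, which is why such a scheme requires the uncertainty of each cue to be represented.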
Currently, we use our framework to develop experimental paradigms that can provide behavioral hallmarks of a fully Bayesian treatment of sensory information in the brain (Koblinger, Fiser & Lengyel, in prep). We are testing whether humans build up their internal representations in a coarse-to-fine manner (Fiser, Orban, Aslin & Lengyel, in prep), how this framework gives a probabilistic interpretation to contextual effects in scenes (Orban, Lengyel & Fiser, in prep), and how the dependence of the capacity of visual working memory on the number of items on a display is just a special case of a more general dependency on the complexity of the input as specified by prior experience (Lengyel, Orban & Fiser, in prep).
The Sampling Hypothesis: implementing probabilistic learning and inference in the cortex
How can probabilistic computation be realistically implemented in the brain? The probabilistic framework requires a continuous reciprocal interaction between groups of elements at different levels of the hierarchical representation, in sharp contrast with the traditional feed-forward view of how visual information is processed in the cortex. However, there are very few proposals as to how such coding can be accomplished in the brain. We have developed such a proposal based on two basic hypotheses: 1) the cortex represents a generative model of the outside world, 2) neural activity can be functionally described as samples from the posterior probability of inferred causes given the visual input (Fiser, Berkes, Orban & Lengyel, 2010). In our earlier work, we have provided evidence that, both at the level of primary visual cortex and at higher areas, the representation of visual information is better conceptualized as the activity pattern of cell assemblies suited for representing probabilities rather than a set of independent feature detectors (Weliky, Fiser, Hunt & Wagner, 2003; Dobbins, Jeo, Fiser & Allman, 1998). We have also shown that the precise developmental pattern and the correlational structure of cell responses in the primary visual cortex call into question the notion that ongoing cortical activity is accidental noise unrelated to visual coding (Fiser, Chiu & Weliky 2004).
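The idea of representing a posterior through samples can be sketched in a toy setting. The example below assumes a one-dimensional linear Gaussian generative model (a latent cause with a Gaussian prior, observed with additive Gaussian noise), for which the posterior is known in closed form. All names and parameter values are hypothetical; the code draws samples directly from the analytical posterior and is not a model of neural dynamics.

```python
import random
import statistics


def posterior_params(x, prior_mean, prior_var, noise_var):
    """Closed-form Gaussian posterior over a latent cause z given observation x.

    Assumed toy generative model: z ~ N(prior_mean, prior_var), x = z + noise,
    noise ~ N(0, noise_var).
    """
    post_var = 1.0 / (1.0 / prior_var + 1.0 / noise_var)
    post_mean = post_var * (prior_mean / prior_var + x / noise_var)
    return post_mean, post_var


def sample_posterior(x, prior_mean, prior_var, noise_var, n_samples, rng):
    """Draw samples that stand in for successive 'activity' values."""
    post_mean, post_var = posterior_params(x, prior_mean, prior_var, noise_var)
    post_sd = post_var ** 0.5
    return [rng.gauss(post_mean, post_sd) for _ in range(n_samples)]


rng = random.Random(0)  # fixed seed for reproducibility
samples = sample_posterior(x=2.0, prior_mean=0.0, prior_var=1.0,
                           noise_var=1.0, n_samples=5000, rng=rng)

# The mean of the samples approximates the inferred value of the cause,
# and their variability approximates the uncertainty about it.
sample_mean = statistics.fmean(samples)
sample_var = statistics.variance(samples)
print(sample_mean, sample_var)
```

Under these assumptions the exact posterior has mean 1.0 and variance 0.5, so the sample statistics should fall close to those values. The point of the sketch is only that a sequence of samples carries both an estimate and its uncertainty, with variability read as information rather than noise.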
Based on these findings, we proposed that ongoing activity is the manifestation of internal states of the brain that express relevant prior knowledge of the world for perception, and sensory input only modulates these states in a probabilistic manner (Fiser, Berkes, Orban & Lengyel, 2010). This proposal supports Hebb's original notion of internal dynamical states being crucial for integrating cognitive processes beyond simple stimulus-response associations, potentially closing the gap between response functions and complex behavior. Using multi-electrode recordings in the developing ferret brain, we have confirmed a fundamental prediction of the proposal, namely that the distribution of spontaneous activity should converge with age to the distribution of evoked activity marginalized over natural visual stimuli (Berkes, Orban, Lengyel & Fiser, 2011). We also found that suppression of cortical neural variability in the awake, but not in the anesthetized, animal is stimulus- and state-dependent, further supporting the special status of spontaneous activity in cortical processing (White, Abbott & Fiser 2012). We provided a complete mapping of our computational model onto traditional neurophysiological quantities and thereby derived and confirmed physiologically testable trial-by-trial predictions of the model (Orban, Berkes, Fiser & Lengyel 2016). We also extended the framework to hierarchical probabilistic models by incorporating a perceptual decision making task, which allowed us to naturally capture a number of top-down-effect-related phenomena earlier attributed to attention and cognition (Haefner, Berkes & Fiser 2016).
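The prediction about spontaneous and evoked activity can be written compactly. The notation below is an assumption introduced for illustration: r denotes the neural activity pattern, x a visual stimulus, and p_nat(x) the distribution of natural stimuli. If activity represents samples from the posterior of a well-adapted internal model, the posterior averaged over natural inputs should equal the prior, which is what spontaneous activity is proposed to express:

```latex
\[
  p(\mathbf{r} \mid \text{spontaneous})
  \;\approx\;
  \int p(\mathbf{r} \mid \mathbf{x}) \, p_{\text{nat}}(\mathbf{x}) \, d\mathbf{x}
\]
```

The left-hand side is the distribution of activity in the absence of visual input, and the right-hand side is the evoked activity distribution marginalized over natural stimuli. The prediction is that the two become more similar with age as the internal model adapts to the statistics of the environment.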
Currently, we are expanding the sampling framework to different modalities and species, and to dynamically changing environments. In addition, we investigate whether the framework provides correct predictions not only for normally reared but also for visually deprived animals, confirming the role of visual experience in developing internal representations (Savin, Chui, Lengyel & Fiser, in prep). We are also developing behavioral paradigms to identify hallmarks of sampling-based probabilistic computation in the brain (Koblinger, Fiser & Lengyel, in prep). Finally, we explore the computational consequences of resource-limited approximate implementation of the sampling framework in the brain (Fiser & Koblinger, 2021).
The effects of sequential perception and learning
What are the computational consequences of sequential processing in the brain? All perceptual and cognitive tasks unfold in time, and this sequential nature has strong consequences. While the sampling framework itself implies sequential computation on one scale (Orban, Berkes, Fiser & Lengyel 2016; Haefner, Berkes & Fiser 2016), additional dynamics come into play with the temporal changes in the sensory input, the internal states and the goals during behavioral tasks. Until recently, research focused mostly on the short-term effects of such changes in simple setups, raising the question of how the results of these investigations generalize to more complex natural situations that allow multiple long-term effects. Using a sequential decision making framework and a setup with changes in the sensory input that could be interpreted in multiple competing ways, we showed that long-term effects can influence momentary perceptions and decisions as strongly as short-term effects, and that these effects cannot be explained by models developed for describing the simple sequential effects studied earlier. We also found that observers automatically develop a full generative model of their perceptual experience even in the simplest tasks, and modify this model probabilistically according to new experience. Specifically, when multiple adjustments of this model can describe their momentary experience, observers unconsciously apply an optimal cue-combination-like arbitration, and select the adjustment that involves changing those parameters of their model that proved to be less reliable in the past (Koblinger, Arato & Fiser, in prep).
Currently, we are exploring computational models that can capture this decision making behavior and tie those models to structure learning and sampling-based implementations (Szabo & Fiser, in prep.). We also investigate how these results relate to active learning, task-switching, and the exploration-exploitation trade-off (Vieira & Fiser, in prep).
Emergence of visual constancies and invariances
What is the strategy of encoding visual information in the cortex? In the classic framework of visual processing, the goal of sensory coding is to retain as much information of the input as possible; thus, maintaining any constancy or invariance over the input is a sign of information loss, i.e. a failure to achieve the original goal. In the generative probabilistic framework, the goal is to develop both momentary and long-term internal world models that are suitable for achieving the organism's specific goals. Hence, discarding information is desirable because it gives the most parsimonious model, and constancies and invariances are needed because they can provide adaptive shortcuts for the most efficient route to achieving the goals. We contrasted these two frameworks in multiple steps relying, among others, on the phenomena of size constancy and size invariance in vision. We provided evidence that a dominant proposal of the classical framework, positing that the cortex encodes the maximal amount of incoming information efficiently through sparsification, is not supported by neural data (Berkes, White & Fiser 2010). We have also shown that human recognition is strongly invariant to size, translation, and reflection (Fiser & Biederman 1995, 2001). We demonstrated that in the case of size, such invariance emerges adaptively based on the immediate context of the visual input (Fiser, Subramaniam & Biederman, 2001), similar to what is typically found with low-level attributes, such as contrast constancy (Fiser, Bex & Makous, 2003; Fiser & Fine, in prep). We also showed that the neural coding of such size invariance is implemented not by the responses of individual cells being more size invariant at higher levels of the hierarchy along the ventral pathway of the cortex, but by population coding (Dobbins, Jeo, Fiser & Allman, 1998).
Currently, we are investigating how 2- and 3-dimensional size invariance emerges through statistical learning processes (Nagy & Fiser, in prep; Nagy, McKenzie & Fiser, in prep) and how the emergence of size invariance and size constancy could be jointly modelled (Garber & Fiser, in prep).
Active learning and active teaching
Do people automatically assess other people’s confidence in situations and use this information for learning from them and teaching them optimally? Optimal perception, optimal learning and optimal teaching are three successively more complex levels of performing a task well under uncertainty. In optimal perception, both the sensory input and all the relevant constraints of processing this input (typically related to the physical context) are clearly specified. In optimal learning, the constraints are not fully and explicitly specified; some of them need to be acquired based on the input and additional constraints. In optimal teaching, a subset of the relevant (unspecified) constraints is related to other individuals rather than to the physical environment. We are exploring the additional unique features required as an agent progresses across these three levels, and the extent to which humans and animals possess these characteristics (Stanciu, Lengyel & Fiser, in preparation).
1. Arató J., Rothkopf C. A. & Fiser J. (2020) Learning in the eyes: specific changes in gaze patterns track explicit and implicit visual learning. bioRxiv 2020.08.03.234039
2. Atkins JE., Fiser J. & Jacobs RA. (2001) Experience-dependent visual cue integration based on consistencies between visual and haptic percepts. Vision research 41 (4), 449-461
3. Avarguès-Weber A., Finke V., Nagy M., Szabó T., d’Amaro D., Dyer A.G. & Fiser J. (2020) Different mechanisms underlie implicit visual statistical learning in honey bees and humans. PNAS 117 (41), 25923-25934
4. Berkes P., Orbán G., Lengyel M. & Fiser J. (2011) Spontaneous cortical activity reveals hallmarks of an optimal internal model of the environment. Science 331 (6013), 83-87 [Highly Cited Paper]
5. Christensen JH., Bex PJ. & Fiser J. (2015) Prior implicit knowledge shapes human threshold for orientation noise. Journal of vision 15 (9), 24-24
6. Christensen JH., Bex PJ. & Fiser J. (2019) Coding of low-level position and orientation information in human naturalistic vision. PLoS ONE 14 (2), e0212141
7. Dobbins AC., Jeo RM., Fiser J. & Allman JM. (1998) Distance modulation of neural activity in the visual cortex. Science 281 (5376), 552-555
8. Fiser J. (2009) Perceptual learning and representational learning in humans and animals. Learning & behavior 37 (2), 141-153
9. Fiser J. & Aslin RN. (2001) Unsupervised statistical learning of higher-order spatial structures from visual scenes. Psychological science 12 (6), 499-504
10. Fiser J. & Aslin RN. (2002) Statistical learning of higher-order temporal structure from visual shape sequences. Journal of Experimental Psychology: Learning, Memory, and Cognition 28 (3), 458
11. Fiser J. & Aslin RN. (2002) Statistical learning of new visual feature combinations by infants. Proceedings of the National Academy of Sciences 99 (24), 15822-15826
12. Fiser J. & Aslin RN. (2005) Encoding multielement scenes: statistical learning of visual feature hierarchies. Journal of Experimental Psychology: General 134 (4), 521
13. Fiser J. & Biederman I. (1995) Size invariance in visual object priming of gray-scale images. Perception 24 (7), 741-748
14. Fiser J. & Biederman I. (2001) Invariance of long-term visual priming to scale, reflection, translation, and hemisphere. Vision Research 41 (2), 221-234
15. Fiser J. & Lengyel G. (2019) A common probabilistic framework for perceptual and statistical learning. Current opinion in neurobiology 58, 218-228
16. Fiser J. & Koblinger Á. (2021) A probabilistic hammer for nailing complex neural data analyses. Neuron 109 (7), 1077-1079
17. Fiser J., Berkes P., Orbán G. & Lengyel M. (2010) Statistically optimal perception and learning: from behavior to neural representations. Trends in cognitive sciences 14 (3), 119-130 [Highly Cited Paper]
18. Fiser J., Bex PJ. & Makous W. (2003) Contrast conservation in human vision. Vision Research 43 (25), 2637-2648
19. Fiser J., Chiu C. & Weliky M. (2004) Small modulations of ongoing cortical dynamics by sensory input during natural vision. Nature 431, 573-578
20. Fiser J., Scholl BJ. & Aslin RN. (2007) Perceived object trajectories during occlusion constrain visual statistical learning. Psychonomic bulletin & review 14 (1), 173-178
21. Fiser J., Subramaniam S. & Biederman I. (2001) Size tuning in the absence of spatial frequency tuning in object recognition. Vision Research 41 (15), 1931-1950
22. Haefner RM., Berkes P. & Fiser J. (2016) Perceptual decision-making as probabilistic inference by neural sampling. Neuron 90 (3), 649-660
23. Janacsek K., Fiser J. & Nemeth D. (2012) The best time to acquire new skills: age-related differences in implicit sequence learning across the human lifespan. Developmental science 15 (4), 496-505
24. Karuza EA., Emberson LL., Roser ME., Cole D., Aslin RN. & Fiser J. (2017) Neural signatures of spatial statistical learning: characterizing the extraction of structure from complex visual scenes. Journal of cognitive neuroscience 29 (12), 1963-1976
25. Koblinger Á., Fiser J. & Lengyel M. (2021) Representations of uncertainty: where art thou? Current Opinion in Behavioral Sciences 38, 150-162
26. Lengyel G., Nagy M. & Fiser J. (2021) Statistically defined visual chunks engage object-based attention. Nature communications 12 (1), 1-12
27. Lengyel G., Zalalyte G., Pantelides A., Ingram JN., Fiser J., Lengyel M. & Wolpert DM. (2019) Unimodal statistical learning produces multimodal object-like representations. eLife 8, e43942
28. Nemeth D., Janacsek K. & Fiser J. (2013) Age-dependent and coordinated shift in performance between implicit and explicit skill learning. Frontiers in computational neuroscience 7, 147
29. Orbán G., Berkes P., Fiser J. & Lengyel M. (2016) Neural variability and sampling-based probabilistic representations in the visual cortex. Neuron 92 (2), 530-543
30. Orbán G., Fiser J., Aslin RN. & Lengyel M. (2008) Bayesian learning of visual chunks by human observers. Proceedings of the National Academy of Sciences 105 (7), 2745-2750
31. Rosa-Salva O., Fiser J., Versace E., Dolci C., Chehaimi S., Santolin C. & Vallortigara G. (2018) Spontaneous learning of visual structures in domestic chicks. Animals 8 (8), 135
32. Roser ME., Aslin RN., McKenzie R., Zahra D. & Fiser J. (2015) Enhanced visual statistical learning in adults with autism. Neuropsychology 29 (2), 163
34. Weliky M., Fiser J., Hunt RH. & Wagner DN. (2003) Coding of natural scenes in primary visual cortex. Neuron 37 (4), 703-718
35. White B., Abbott LF. & Fiser J. (2012) Suppression of cortical neural variability is stimulus-and state-dependent. Journal of neurophysiology 108 (9), 2383-2392