Top-down vs. bottom-up approaches to computational modeling of vision

Tuesday, October 26, 12:00 – 14:00 ET

Computational models fulfill many roles, including theory specification, causal explanation, prediction, and visualization. Over the last decade, modeling has become increasingly elaborate and complex. While more complex models are impressive in their predictive power, they can be theoretically underwhelming and difficult to relate to underlying neural circuitry. In this session, the goals, strengths, and weaknesses of a wide variety of state-of-the-art modeling approaches will be compared.

This special session is being hosted by the Clinical Vision Sciences Technical Group and the Applications of Visual Science Technical Group along with the Fall Vision Meeting Planning Committee.

Invited Speakers:

  • Mark Lescroart, University of Nevada, Reno

  • Stephanie Palmer, The University of Chicago

  • Tatyana Sharpee, Salk Institute for Biological Studies

  • Fred Rieke, University of Washington

Moderator:

  • Ione Fine, University of Washington


Limits of prediction accuracy on randomly selected natural images for model evaluation

Mark Lescroart, Cognitive & Brain Sciences, University of Nevada, Reno

Prediction accuracy on held-out data has become a critical analysis for quantitative model evaluation and hypothesis testing in computational cognitive neuroscience. In this talk, I will discuss the limits of prediction accuracy as a standalone metric and highlight other considerations for model evaluation and interpretation. First, comparing two models on prediction accuracy alone does not reveal the degree to which the models share underlying factors. I will advocate addressing this issue with variance partitioning, a form of commonality analysis, which reveals the shared and unique variance explained by different models. Concretely, I will show how variance partitioning reveals representations of body parts and object boundaries in responses to multiple datasets of movie stimuli. Second, prediction accuracy is a metric for the variance explained by a given model, but for any experiment, the stimulus constrains the variance in the measured brain responses, and any given stimulus set runs the risk of excluding important sources of variation. A popular way to address this issue is to use photographs or movie clips as stimuli. Such naturalistic stimuli are typically sampled broadly from the world and thus have increased ecological validity, but random selection of natural stimuli often results in correlated features both within and between models. This often leads to ambiguous results, e.g., shared variance between models intended to capture different types of features. Furthermore, I will show that the same models (again of body parts and object boundaries) can yield quantitatively and in some cases qualitatively different results when applied to different datasets. This raises a critical question: if results for the same model vary across stimulus sets, which result provides a more solid basis for future work? Just as two clocks showing different times can only be set against a reference clock, I will argue that we need broadly sampled sets of natural stimuli to serve as a baseline for what, in various feature domains, constitutes "natural" variation and covariation. I will describe our collaboration to create just such a dataset of human visual experience, in the form of hundreds of hours of first-person video.
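To make the variance-partitioning bookkeeping concrete, here is a minimal sketch of the idea, not the speaker's actual pipeline. It assumes two hypothetical feature matrices X_a and X_b (say, body-part and object-boundary features) and a measured response y, with ridge regression standing in for whichever encoding model is actually used; unique and shared variance are estimated from the held-out R² of each model alone and of the concatenated joint model.

```python
import numpy as np
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import train_test_split

def r2_held_out(X, y):
    # Fit a ridge encoding model and return prediction R^2 on held-out data.
    # random_state is fixed so every model sees the same train/test split.
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)
    model = RidgeCV(alphas=np.logspace(-2, 4, 7)).fit(X_tr, y_tr)
    return model.score(X_te, y_te)

def variance_partition(X_a, X_b, y):
    # Commonality analysis for two feature spaces: held-out R^2 of each
    # model alone and of the concatenated joint model give the partition.
    r2_a = r2_held_out(X_a, y)
    r2_b = r2_held_out(X_b, y)
    r2_ab = r2_held_out(np.hstack([X_a, X_b]), y)
    return {
        "unique_a": r2_ab - r2_b,        # variance only model A explains
        "unique_b": r2_ab - r2_a,        # variance only model B explains
        "shared": r2_a + r2_b - r2_ab,   # variance either model could explain
    }

# Toy demo: both feature spaces contain one common latent factor (shared
# variance), and X_a contains one extra predictive feature (unique variance).
rng = np.random.default_rng(0)
latent = rng.normal(size=(500, 1))
X_a = np.hstack([latent, rng.normal(size=(500, 3))])
X_b = np.hstack([latent, rng.normal(size=(500, 3))])
y = latent[:, 0] + X_a[:, 1] + 0.5 * rng.normal(size=500)
print(variance_partition(X_a, X_b, y))
```

Note that the shared term can come out negative when the two feature spaces act as mutual suppressors, one reason partition results need careful interpretation.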


How behavioral and evolutionary constraints sculpt early visual processing

Stephanie Palmer, Department of Organismal Biology and Anatomy, Department of Physics, University of Chicago

An animal eye is only as efficient as the organism's behavioral constraints demand it to be. While efficient coding has been a successful organizational principle in vision, a more general theory must add behavioral, mechanistic, and even evolutionary constraints to this framework. In our work, we use a mix of known computational hurdles and detailed behavioral measurements to add constraints to the notion of optimality in vision. Accurate visual prediction is one such constraint. Prediction is essential for interacting fluidly and accurately with our environment because of the delays inherent in all brain circuits: to interact appropriately with a changing environment, the brain must respond not only to the current state of sensory inputs but must also make rapid predictions of the future. We explore how the visual system makes these predictions, starting as early as the eye, borrowing techniques from statistical physics and information theory to assess how we get terrific, predictive vision from imperfect (lagged and noisy) component parts. To test whether the visual system performs optimal predictive compression and computation, we compute the past and future stimulus information carried by populations of retinal ganglion cells and by the vertical motion sensing system of the fly. In the fly, we anchor our calculations with careful measurements from the Dickinson group on fast evasive flight maneuvers. This survival-critical behavior requires fast and accurate control of flight, which we show can be achieved by visual prediction in the fly's vertical motion sensing system via a specific wiring motif. Moving beyond behavior, a current research direction in our group is a general theory of the evolution of computation: we use the repeated evolution of tetrachromatic color vision in butterflies to test hypotheses about whether neural computations contain shadows of the evolutionary history of the organism.
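For readers unfamiliar with the information quantities involved, here is a toy sketch, again illustrative rather than the group's actual method. It estimates the mutual information between a scalar response time series and the stimulus at a time offset dt, so that dt > 0 probes information about the future stimulus; the plug-in histogram estimator and the synthetic data are stand-ins for the bias-corrected estimators and neural recordings used in practice.

```python
import numpy as np

def mutual_information(x, y, bins=16):
    # Plug-in estimate of I(X; Y) in bits from paired samples, via a joint
    # histogram (crude: upward-biased for small sample sizes).
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    p_xy = joint / joint.sum()
    p_x = p_xy.sum(axis=1, keepdims=True)   # marginal over y
    p_y = p_xy.sum(axis=0, keepdims=True)   # marginal over x
    nz = p_xy > 0                           # skip empty bins to avoid log(0)
    return float(np.sum(p_xy[nz] * np.log2(p_xy[nz] / (p_x * p_y)[nz])))

def offset_information(response, stimulus, dt):
    # I(response at t; stimulus at t + dt). Positive dt probes prediction
    # of the future stimulus; negative dt probes encoding of the past.
    if dt > 0:
        return mutual_information(response[:-dt], stimulus[dt:])
    if dt < 0:
        return mutual_information(response[-dt:], stimulus[:dt])
    return mutual_information(response, stimulus)

# Toy demo: a response that is a delayed, noisy copy of a white-noise
# stimulus carries information about the stimulus's past but not its future.
rng = np.random.default_rng(0)
stim = rng.normal(size=10_000)
resp = np.roll(stim, 5) + 0.3 * rng.normal(size=10_000)  # response lags by 5 steps
print(offset_information(resp, stim, dt=-5))  # high: the past the response encodes
print(offset_information(resp, stim, dt=+5))  # near zero: white noise is unpredictable
```

For a naturalistic stimulus with temporal correlations, the dt > 0 curve would stay positive, and the question the talk addresses is how close neural codes come to the optimal trade-off between compressing the past and retaining that predictive information.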