Los Alamos National Laboratory


National Security Education Center


Untangling Visual Object Recognition in the Brain

March 2, 2009
Time: 4:00 PM
Location: CNLS Conference Room (TA-3, SM1690, Rm. 102)

Jim DiCarlo, McGovern Institute for Brain Research and Dept. of Brain and Cognitive Sciences, Massachusetts Institute of Technology

Although object recognition is fundamental to our behavior and seemingly effortless, it is a remarkably challenging computational problem because the visual system must somehow tolerate tremendous image variation produced by different views of each object (the "invariance" problem). In this talk, I will present a framework for thinking about that computational crux of object recognition and how it might be solved ("untangling" object manifolds). Our current neurophysiological evidence suggests that the primate brain accomplishes this untangling by gradually transforming its initial neuronal population representation (a photograph on the retina) to a new, explicit form of neuronal population representation at the highest level of the primate ventral visual stream (inferior temporal cortex, IT). We have recently discovered that unsupervised learning of naturally occurring temporal contiguity cues in the visual environment can play a key role in constructing the untangling solution in IT.

The only way to know if such neuroscience results can explain visual recognition is to incorporate them into instantiated computational models. But the challenges are formidable: 1) neuroscience data do not fully constrain many of the important parameters ("details") of such models, 2) the primate visual system operates at high dimensionality and with years of natural experience, and 3) the community lacks well-defined methods of assessing the progress of such models. To approach these problems, we and our collaborators are leveraging recent advances in stream processing hardware (high-end GPUs and the PlayStation 3's Cell processor). By analogy with high-throughput screening approaches in molecular biology, we are screening among thousands of network architectures using appropriate recognition benchmarks. We found that this approach gives reproducible gains in recognition performance and can offer insight into which model parameters are most important. As available computational power continues to expand and new neuroscience data are acquired, this approach has the potential to greatly accelerate our understanding of how the visual system accomplishes object recognition.
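
As a rough illustration of the screening idea described in the abstract, the sketch below draws many candidate architectures at random, scores each one, and keeps the best. It is only a toy under stated assumptions, not the speaker's method: the parameter names and ranges in PARAM_SPACE, the placeholder_benchmark scoring formula, and the screen function are all hypothetical, and a real screen would train and evaluate each candidate network on an actual recognition benchmark, typically on GPU hardware.

```python
import random

# Hypothetical parameter space. Names and ranges are illustrative
# assumptions, not values taken from the talk.
PARAM_SPACE = {
    "num_layers": [1, 2, 3],
    "filter_size": [3, 5, 7, 9],
    "num_filters": [16, 32, 64],
    "pool_size": [2, 3, 5],
}


def sample_architecture(rng):
    """Draw one candidate by choosing a random value for each parameter."""
    return {name: rng.choice(values) for name, values in PARAM_SPACE.items()}


def placeholder_benchmark(arch):
    """Stand-in for a real recognition benchmark; returns a score in [0, 1].

    This toy formula is NOT a model of vision. A real screen would build
    the network described by `arch` and measure recognition accuracy.
    """
    score = 0.5 + 0.1 * arch["num_layers"] - 0.01 * abs(arch["filter_size"] - 5)
    return min(1.0, max(0.0, score))


def screen(num_candidates, top_k, benchmark, seed=0):
    """Score randomly sampled candidates and return the top_k (score, arch) pairs."""
    rng = random.Random(seed)  # fixed seed so the screen is reproducible
    candidates = [sample_architecture(rng) for _ in range(num_candidates)]
    scored = [(benchmark(arch), arch) for arch in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)  # best score first
    return scored[:top_k]


if __name__ == "__main__":
    for score, arch in screen(num_candidates=1000, top_k=3,
                              benchmark=placeholder_benchmark):
        print(round(score, 3), arch)
```

Comparing the parameter values that recur among the top-scoring candidates is one simple way such a screen could hint at which model parameters matter most.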

Operated by Los Alamos National Security, LLC for the U.S. Department of Energy's NNSA