A High-Throughput Screening Approach to Discovering Good Forms of Biologically Inspired Visual Representation

Bibliographic Details
Main Authors: Cox, David D. (Contributor), Pinto, Nicolas (Contributor), Doukhan, David (Contributor), DiCarlo, James (Contributor)
Other Authors: Massachusetts Institute of Technology. Department of Brain and Cognitive Sciences (Contributor), McGovern Institute for Brain Research at MIT (Contributor)
Format: Article
Language: English
Published: Public Library of Science, 2010-06-03T15:20:08Z.
Description
Summary: While many models of biological object recognition share a common set of "broad-stroke" properties, the performance of any one model depends strongly on the choice of parameters in a particular instantiation of that model (e.g., the number of units per layer, the size of pooling kernels, and exponents in normalization operations). Since the number of such parameters (explicit or implicit) is typically large and the computational cost of evaluating one particular parameter set is high, the space of possible model instantiations goes largely unexplored. Thus, when a model fails to approach the abilities of biological visual systems, we are left uncertain whether this failure is because we are missing a fundamental idea or because the correct "parts" have not been tuned correctly, assembled at sufficient scale, or provided with enough training. Here, we present a high-throughput approach to the exploration of such parameter sets, leveraging recent advances in stream processing hardware (high-end NVIDIA graphics cards and the PlayStation 3's IBM Cell Processor). In analogy to high-throughput screening approaches in molecular biology and genetics, we explored thousands of potential network architectures and parameter instantiations, screening those that show promising object recognition performance for further analysis. We show that this approach can yield significant, reproducible gains in performance across an array of basic object recognition tasks, consistently outperforming a variety of state-of-the-art purpose-built vision systems from the literature. As the scale of available computational power continues to expand, we argue that this approach has the potential to greatly accelerate progress in both artificial vision and our understanding of the computational underpinnings of biological vision.
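The screening idea described in the summary can be illustrated with a minimal sketch: draw many candidate parameter sets at random, score each one, and keep the best few for further analysis. This is not the authors' implementation. The search space, the parameter names (`units_per_layer`, `pool_kernel_size`, `norm_exponent`), their value ranges, and the `toy_evaluate` scoring function are all hypothetical placeholders; a real screen would build, train, and test a model on an object recognition task at the scoring step.

```python
import random

# Hypothetical search space. The names and ranges are illustrative
# assumptions, not values taken from the paper.
SEARCH_SPACE = {
    "units_per_layer": [16, 32, 64, 128, 256],
    "pool_kernel_size": [3, 5, 7, 9],
    "norm_exponent": [1.0, 2.0, 10.0],
}


def sample_params(rng):
    """Draw one candidate parameter set uniformly at random."""
    return {name: rng.choice(values) for name, values in SEARCH_SPACE.items()}


def screen(evaluate, n_candidates, n_keep, seed=0):
    """Score n_candidates random parameter sets and return the n_keep best.

    Returns a list of (score, params) pairs, highest score first.
    """
    rng = random.Random(seed)  # seeded so the screen is reproducible
    candidates = [sample_params(rng) for _ in range(n_candidates)]
    scored = [(evaluate(params), params) for params in candidates]
    # Sort on the score only, so the parameter dicts are never compared.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:n_keep]


def toy_evaluate(params):
    """Placeholder score standing in for a train-and-test run (assumption)."""
    return (-abs(params["units_per_layer"] - 64)
            - abs(params["pool_kernel_size"] - 5))


if __name__ == "__main__":
    for score, params in screen(toy_evaluate, n_candidates=200, n_keep=5):
        print(score, params)
```

Because each candidate is scored independently, the evaluation loop is the part that would be distributed across parallel hardware in a large screen.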
Dr. Gerald Burnett and Marjorie Burnett
McKnight Endowment for Neuroscience
Rowland Institute at Harvard
National Institutes of Health (U.S.) (NEI R01EY014970)