Why is real-world visual object recognition hard?

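Progress in understanding the brain mechanisms underlying vision requires the construction of computational models that not only emulate the brain's anatomy and physiology, but ultimately match its performance on visual tasks. In recent years, "natural" images have become popular in the study of vision and have been used to show apparently impressive progress in building such models. Here, we challenge the use of uncontrolled "natural" images in guiding that progress. In particular, we show that a simple V1-like model--a neuroscientist's "null" model, which should perform poorly at real-world visual object recognition tasks--outperforms state-of-the-art object recognition systems (biologically inspired and otherwise) on a standard, ostensibly natural image recognition test. As a counterpoint, we designed a "simpler" recognition test to better span the real-world variation in object pose, position, and scale, and we show that this test correctly exposes the inadequacy of the V1-like model. Taken together, these results demonstrate that tests based on uncontrolled natural images can be seriously misleading, potentially guiding progress in the wrong direction. Instead, we reexamine what it means for images to be natural and argue for a renewed focus on the core problem of object recognition--real-world image variation.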

Bibliographic Details
Main Authors: Nicolas Pinto, David D Cox, James J DiCarlo
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2008-01-01
Series: PLoS Computational Biology
Online Access: http://europepmc.org/articles/PMC2211529?pdf=render
id doaj-f86d5f7141c943bca26752402f16ecb8
record_format Article
spelling doaj-f86d5f7141c943bca26752402f16ecb8 2020-11-25T02:19:18Z eng Public Library of Science (PLoS) PLoS Computational Biology 1553-734X 1553-7358 2008-01-01 4 1 e27 10.1371/journal.pcbi.0040027 Why is real-world visual object recognition hard? Nicolas Pinto David D Cox James J DiCarlo Progress in understanding the brain mechanisms underlying vision requires the construction of computational models that not only emulate the brain's anatomy and physiology, but ultimately match its performance on visual tasks. In recent years, "natural" images have become popular in the study of vision and have been used to show apparently impressive progress in building such models. Here, we challenge the use of uncontrolled "natural" images in guiding that progress. In particular, we show that a simple V1-like model--a neuroscientist's "null" model, which should perform poorly at real-world visual object recognition tasks--outperforms state-of-the-art object recognition systems (biologically inspired and otherwise) on a standard, ostensibly natural image recognition test. As a counterpoint, we designed a "simpler" recognition test to better span the real-world variation in object pose, position, and scale, and we show that this test correctly exposes the inadequacy of the V1-like model. Taken together, these results demonstrate that tests based on uncontrolled natural images can be seriously misleading, potentially guiding progress in the wrong direction. Instead, we reexamine what it means for images to be natural and argue for a renewed focus on the core problem of object recognition--real-world image variation. http://europepmc.org/articles/PMC2211529?pdf=render
collection DOAJ
language English
format Article
sources DOAJ
author Nicolas Pinto
David D Cox
James J DiCarlo
spellingShingle Nicolas Pinto
David D Cox
James J DiCarlo
Why is real-world visual object recognition hard?
PLoS Computational Biology
author_facet Nicolas Pinto
David D Cox
James J DiCarlo
author_sort Nicolas Pinto
title Why is real-world visual object recognition hard?
title_short Why is real-world visual object recognition hard?
title_full Why is real-world visual object recognition hard?
title_fullStr Why is real-world visual object recognition hard?
title_full_unstemmed Why is real-world visual object recognition hard?
title_sort why is real-world visual object recognition hard?
publisher Public Library of Science (PLoS)
series PLoS Computational Biology
issn 1553-734X
1553-7358
publishDate 2008-01-01
description Progress in understanding the brain mechanisms underlying vision requires the construction of computational models that not only emulate the brain's anatomy and physiology, but ultimately match its performance on visual tasks. In recent years, "natural" images have become popular in the study of vision and have been used to show apparently impressive progress in building such models. Here, we challenge the use of uncontrolled "natural" images in guiding that progress. In particular, we show that a simple V1-like model--a neuroscientist's "null" model, which should perform poorly at real-world visual object recognition tasks--outperforms state-of-the-art object recognition systems (biologically inspired and otherwise) on a standard, ostensibly natural image recognition test. As a counterpoint, we designed a "simpler" recognition test to better span the real-world variation in object pose, position, and scale, and we show that this test correctly exposes the inadequacy of the V1-like model. Taken together, these results demonstrate that tests based on uncontrolled natural images can be seriously misleading, potentially guiding progress in the wrong direction. Instead, we reexamine what it means for images to be natural and argue for a renewed focus on the core problem of object recognition--real-world image variation.
url http://europepmc.org/articles/PMC2211529?pdf=render
work_keys_str_mv AT nicolaspinto whyisrealworldvisualobjectrecognitionhard
AT daviddcox whyisrealworldvisualobjectrecognitionhard
AT jamesjdicarlo whyisrealworldvisualobjectrecognitionhard
_version_ 1724876936047493120