Energy-Quality Scalable Memory-Frugal Feature Extraction for Always-On Deep Sub-mW Distributed Vision

Bibliographic Details
Main Authors: Anastacia Alvarez, Gopalakrishnan Ponnusamy, Massimo Alioto
Format: Article
Language: English
Published: IEEE 2020-01-01
Series: IEEE Access
Subjects:
Online Access: https://ieeexplore.ieee.org/document/8966284/
Description
Summary: In this work, an energy-quality (EQ) scalable and memory-frugal architecture for video feature extraction is introduced to reduce circuit complexity, power, and silicon area. Leveraging the inherent resilience of vision to noise and inaccuracies, the proposed approach introduces properly selected EQ tuning knobs that reduce the energy of feature extraction at graceful quality degradation. As opposed to prior art, the proposed architecture enables the adjustment of such knobs and adapts its cycle-level timing to reduce the amount of computation per frame at lower quality targets. As a further benefit, the approach adds opportunities for energy reduction via aggressive voltage scaling. The proposed architecture mitigates the traditionally dominant area and energy of the on-chip memory by reducing the number of pixels stored on chip and by introducing memory access reuse and on-the-fly computation. At the same time, EQ tuning preserves the ability to operate conventionally at maximum quality, when required by the task or the visual context. A 0.55 mm² testchip in 40 nm exhibits power down to 82 μW at a 5 fps frame rate (i.e., 33× lower than prior art), while assuring successful object detection at VGA resolution. To the best of the authors' knowledge, this is the first feature extractor with sub-mW operation and sub-mm² area, making the proposed approach well suited for tightly power-constrained, low-cost distributed vision systems (e.g., video sensor nodes).
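
As a rough, purely illustrative sketch of the energy-quality trade-off described in the summary (the abstract does not disclose the actual knobs used in the design), the short C program below models one hypothetical knob, a pixel subsampling stride, whose setting scales the computation performed per frame and hence the energy:

/*
 * Illustrative sketch only: the paper's actual EQ knobs are not detailed in
 * this abstract, so a single hypothetical knob (a pixel subsampling stride)
 * stands in for them here to show how computation per frame can scale with
 * the quality target.
 */
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define FRAME_W 640   /* VGA resolution, as in the reported testchip */
#define FRAME_H 480

/* Hypothetical EQ knob: stride 1 = maximum quality, larger = lower energy. */
typedef struct {
    unsigned stride;
} eq_config;

/* Pixels visited per frame for a given knob setting: a rough proxy for the
 * cycle count, and hence the energy, spent on feature extraction. */
static size_t pixels_per_frame(const eq_config *cfg)
{
    size_t cols = (FRAME_W + cfg->stride - 1) / cfg->stride;
    size_t rows = (FRAME_H + cfg->stride - 1) / cfg->stride;
    return cols * rows;
}

int main(void)
{
    /* Sweeping the knob illustrates a graceful quality/energy trade-off. */
    for (unsigned s = 1; s <= 4; s *= 2) {
        eq_config cfg = { .stride = s };
        printf("stride %u -> %zu pixels per frame\n", s, pixels_per_frame(&cfg));
    }
    return 0;
}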
ISSN: 2169-3536