Ultra-low-power Design and Implementation of Application-specific Instruction-set Processors for Ubiquitous Sensing and Computing

The feature size of transistors keeps shrinking as technology advances, which enables ubiquitous sensing and computing. However, with the breakdown of Dennard scaling, caused by the difficulty of further lowering the supply voltage, power density increases significantly. The consequen...

Bibliographic Details
Main Author: Ma, Ning
Format: Doctoral Thesis
Language: English
Published: KTH, Industriell och Medicinsk Elektronik 2015
Online Access: http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-174896
http://nbn-resolving.de/urn:isbn:978-91-7595-692-3
id ndltd-UPSALLA1-oai-DiVA.org-kth-174896
record_format oai_dc
spelling ndltd-UPSALLA1-oai-DiVA.org-kth-174896
2015-10-10T04:55:45Z
Ultra-low-power Design and Implementation of Application-specific Instruction-set Processors for Ubiquitous Sensing and Computing
eng
Ma, Ning
KTH, Industriell och Medicinsk Elektronik
KTH, VinnExcellence Center for Intelligence in Paper and Packaging, iPACK
Stockholm
2015
QC 20151009
Doctoral thesis, comprehensive summary
info:eu-repo/semantics/doctoralThesis
text
http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-174896
urn:isbn:978-91-7595-692-3
TRITA-ICT, 1653-6363 ; 15:11
application/pdf
info:eu-repo/semantics/openAccess
collection NDLTD
language English
format Doctoral Thesis
sources NDLTD
description The feature size of transistors keeps shrinking as technology advances, which enables ubiquitous sensing and computing. However, with the breakdown of Dennard scaling, caused by the difficulty of further lowering the supply voltage, power density increases significantly. Consequently, for a given power budget, the energy efficiency of hardware resources must be improved to maximize performance. Application-specific integrated circuits (ASICs) achieve high energy efficiency at the cost of flexibility across applications, while general-purpose processors (GPPs) gain generality at the expense of efficiency. To provide both high energy efficiency and flexibility, this dissertation explores the ultra-low-power design of application-specific instruction-set processors (ASIPs) for ubiquitous sensing and computing. Two application scenarios are implemented in the proposed ASIPs: high-throughput, compute-intensive processing for multimedia, and low-throughput, low-cost processing for the Internet of Things (IoT).
Multimedia stream processing for human-computer interaction is characterized by high data throughput. To design processors for networked multimedia streams, application-specific accelerators controlled by an embedded processor are customized. By abstracting the common features of multiple coding algorithms, video decoding accelerators are implemented for networked multi-standard multimedia stream processing. Fabricated in 0.13 $\mu$m CMOS technology, the processor runs at 216 MHz and decodes high-definition video streams in real time with a power consumption of 414 mW. When even higher throughput is required, such as in multi-view video coding applications, multiple customized processors are connected by an on-chip network. Design problems are further studied for selecting the capability of each processor, the number of processors, the capacity of the communication network, and the task assignment scheme.
In the IoT scenario, a wide spectrum of devices demands low processing throughput but high energy efficiency and adaptability. For this case, a tile processor comprising a multi-mode router and dual cores is proposed and implemented. The multi-mode router supports both circuit and wormhole switching to facilitate inter-silicon extension, providing on-demand performance. The control-centric dual-core architecture uses control words to directly manipulate all hardware resources. This mechanism avoids complex control logic and increases hardware utilization. Programmable control words make the processor reconfigurable, supporting general-purpose ISAs, application-specific instructions, and dedicated implementations. Reducing global data transfer further increases energy efficiency. Finally, a single tile processor, a network of bare dies, and a network of packaged chips have been demonstrated. The processor is implemented in 65 nm low-leakage CMOS technology and achieves an energy efficiency of 101.4 GOPS/W per core.
QC 20151009
author Ma, Ning
title Ultra-low-power Design and Implementation of Application-specific Instruction-set Processors for Ubiquitous Sensing and Computing
publisher KTH, Industriell och Medicinsk Elektronik
publishDate 2015
url http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-174896
http://nbn-resolving.de/urn:isbn:978-91-7595-692-3