Universal Function Approximation by Deep Neural Nets with Bounded Width and ReLU Activations

This article concerns the expressive power of depth in neural nets with ReLU activations and a bounded width. We are particularly interested in the following questions: What is the minimal width w_min(d) so that ReLU nets of width w_min(d) (and arbitrary depth) can approximate any continuous function on the unit cube [0,1]^d arbitrarily well? For ReLU nets near this minimal width, what can one say about the depth necessary to approximate a given function? We obtain an essentially complete answer to these questions for convex functions. Our approach is based on the observation that, due to the convexity of the ReLU activation, ReLU nets are particularly well suited to represent convex functions. In particular, we prove that ReLU nets with width d + 1 can approximate any continuous convex function of d variables arbitrarily well. These results then give quantitative depth estimates for the rate of approximation of any continuous scalar function on the d-dimensional cube [0,1]^d by ReLU nets with width d + 3.
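
The observation highlighted in the abstract (that ReLU activations are well matched to convex functions) can be illustrated numerically: a convex function is the supremum of its supporting hyperplanes, and a pointwise maximum of finitely many affine maps is exactly the kind of piecewise-linear convex function a narrow ReLU net can compute. The Python sketch below shows that max-of-affine idea under my own naming (max_affine_approx, anchors); it is an illustration of the underlying principle, not the paper's width-(d + 1) construction.

```python
# Illustration only: approximate a convex function by the maximum of its
# supporting hyperplanes at a finite set of anchor points. Such max-of-affine
# functions are convex, piecewise linear, and ReLU-representable.
import numpy as np

def max_affine_approx(f, grad_f, anchors):
    """Return g(x) = max over x0 in anchors of [ f(x0) + grad_f(x0) . (x - x0) ].

    For convex f, each term is a supporting hyperplane, so g <= f pointwise,
    and g approaches f uniformly on compact sets as the anchor set is refined.
    """
    def g(x):
        x = np.asarray(x, dtype=float)
        return max(f(x0) + np.dot(grad_f(x0), x - x0) for x0 in anchors)
    return g

# Example: f(x) = |x|^2 on the unit square [0, 1]^2 (d = 2).
f = lambda x: float(np.dot(x, x))
grad_f = lambda x: 2.0 * x
anchors = [np.array([i / 4.0, j / 4.0]) for i in range(5) for j in range(5)]
g = max_affine_approx(f, grad_f, anchors)

x = np.array([0.37, 0.61])
print(f(x), g(x))  # the gap between the two values shrinks as the anchor grid refines
```

Refining the anchor set improves the approximation; the article's result, as stated in the abstract, is that ReLU nets of width d + 1 suffice to approximate any continuous convex function of d variables arbitrarily well.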

Bibliographic Details
Main Author: Boris Hanin (Department of Mathematics, Texas A&M, College Station, TX 77843, USA)
Format: Article
Language: English
Published: MDPI AG, 2019-10-01
Series: Mathematics, Vol. 7, No. 10, Art. 992
ISSN: 2227-7390
DOI: 10.3390/math7100992
Subjects: deep neural nets; ReLU networks; approximation theory
Online Access: https://www.mdpi.com/2227-7390/7/10/992