Deep Network With Approximation Error Being Reciprocal of Width to Power of Square Root of Depth

A new network with super-approximation power is introduced. This network is built with Floor (⌊x⌋) or ReLU (max{0,x}) activation function in each neuron; hence, we call such networks Floor-ReLU networks. For any hyperparameters N∈ℕ⁺ and L∈ℕ⁺, we show that Floor-ReLU networks with width max{d,5N+13}...

Bibliographic Details
Main Authors:	Shen, Z. (Author), Yang, H. (Author), Zhang, S. (Author)
Format:	Article
Language:	English
Published:	NLM (Medline) 2021
Subjects:	article
Online Access:	View Fulltext in Publisher


LEADER	01632nam a2200169Ia 4500
001	10.1162-neco_a_01364
008	220427s2021 CNT 000 0 und d
020			\|a 1530888X (ISSN)
245	1	0	\|a Deep Network With Approximation Error Being Reciprocal of Width to Power of Square Root of Depth
260		0	\|b NLM (Medline) \|c 2021
856			\|z View Fulltext in Publisher \|u https://doi.org/10.1162/neco_a_01364
520	3		\|a A new network with super-approximation power is introduced. This network is built with Floor (⌊x⌋) or ReLU (max{0,x}) activation function in each neuron; hence, we call such networks Floor-ReLU networks. For any hyperparameters N∈ℕ⁺ and L∈ℕ⁺, we show that Floor-ReLU networks with width max{d,5N+13} and depth 64dL+3 can uniformly approximate a Hölder function f on [0,1]^d with an approximation error 3λd^{α/2}N^{-α√L}, where α∈(0,1] and λ are the Hölder order and constant, respectively. More generally, for an arbitrary continuous function f on [0,1]^d with a modulus of continuity ω_f(·), the constructive approximation rate is ω_f(√d·N^{-√L}) + 2ω_f(√d)·N^{-√L}. As a consequence, this new class of networks overcomes the curse of dimensionality in approximation power when the variation of ω_f(r) as r→0 is moderate (e.g., ω_f(r)≲r^α for Hölder continuous functions), since the major term to be considered in our approximation rate is essentially √d times a function of N and L independent of d within the modulus of continuity. © 2021 Massachusetts Institute of Technology.
650	0	4	\|a article
700	1		\|a Shen, Z. \|e author
700	1		\|a Yang, H. \|e author
700	1		\|a Zhang, S. \|e author
773			\|t Neural computation
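
For readers who want to get a feel for the rates quoted in the abstract, the following is a minimal sketch that evaluates them numerically. It assumes the formulas read as width max{d, 5N+13}, depth 64dL+3, Hölder error bound 3λd^{α/2}N^{-α√L}, and general rate ω_f(√d·N^{-√L}) + 2ω_f(√d)·N^{-√L}, with the square roots that the title ("Square Root of Depth") implies. The function names are hypothetical and are not taken from the paper; the code only evaluates the bounds and does not construct a Floor-ReLU network.

```python
import math


def floor_relu_size(d, N, L):
    """Width and depth stated in the abstract: max{d, 5N+13} and 64dL+3."""
    return max(d, 5 * N + 13), 64 * d * L + 3


def holder_error_bound(d, N, L, alpha, lam):
    """Hölder bound 3 * lam * d^(alpha/2) * N^(-alpha * sqrt(L)).

    alpha in (0, 1] is the Hölder order and lam the Hölder constant.
    """
    return 3 * lam * d ** (alpha / 2) * N ** (-alpha * math.sqrt(L))


def general_rate(omega, d, N, L):
    """General rate omega(sqrt(d) * N^(-sqrt(L))) + 2 * omega(sqrt(d)) * N^(-sqrt(L)).

    omega is a callable standing in for the modulus of continuity of f.
    """
    r = N ** (-math.sqrt(L))
    return omega(math.sqrt(d) * r) + 2 * omega(math.sqrt(d)) * r


if __name__ == "__main__":
    d, N, L = 4, 2, 4
    print(floor_relu_size(d, N, L))
    print(holder_error_bound(d, N, L, alpha=1.0, lam=1.0))
    # A Lipschitz function with constant 1 has omega(r) = r.
    print(general_rate(lambda r: r, d, N, L))
```

With ω_f(r) = λr^α the general rate is at most the Hölder bound, because N^{-√L} ≤ N^{-α√L} whenever α ≤ 1 and N ≥ 1; for α = 1 the two expressions coincide.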
