Deep Network With Approximation Error Being Reciprocal of Width to Power of Square Root of Depth

A new network with super-approximation power is introduced. This network is built with Floor (⌊x⌋) or ReLU (max{0,x}) activation function in each neuron; hence, we call such networks Floor-ReLU networks. For any hyperparameters N ∈ ℕ⁺ and L ∈ ℕ⁺, we show that Floor-ReLU networks with width max{d, 5N+13}...

Bibliographic Details
Main Authors: Shen, Z. (Author), Yang, H. (Author), Zhang, S. (Author)
Format: Article
Language: English
Published: NLM (Medline) 2021
Subjects:
Online Access: https://doi.org/10.1162/neco_a_01364
LEADER 01632nam a2200169Ia 4500
001 10.1162-neco_a_01364
008 220427s2021 CNT 000 0 und d
020 |a 1530888X (ISSN) 
245 1 0 |a Deep Network With Approximation Error Being Reciprocal of Width to Power of Square Root of Depth 
260 0 |b NLM (Medline)  |c 2021 
856 |z View Fulltext in Publisher  |u https://doi.org/10.1162/neco_a_01364 
520 3 |a A new network with super-approximation power is introduced. This network is built with Floor (⌊x⌋) or ReLU (max{0,x}) activation function in each neuron; hence, we call such networks Floor-ReLU networks. For any hyperparameters N ∈ ℕ⁺ and L ∈ ℕ⁺, we show that Floor-ReLU networks with width max{d, 5N+13} and depth 64dL+3 can uniformly approximate a Hölder function f on [0,1]^d with an approximation error 3λd^{α/2}N^{-α√L}, where α ∈ (0,1] and λ are the Hölder order and constant, respectively. More generally for an arbitrary continuous function f on [0,1]^d with a modulus of continuity ω_f(·), the constructive approximation rate is ω_f(√d N^{-√L}) + 2ω_f(√d)N^{-√L}. As a consequence, this new class of networks overcomes the curse of dimensionality in approximation power when the variation of ω_f(r) as r→0 is moderate (e.g., ω_f(r) ≲ r^α for Hölder continuous functions), since the major term to be considered in our approximation rate is essentially √d times a function of N and L independent of d within the modulus of continuity. © 2021 Massachusetts Institute of Technology. 
650 0 4 |a article 
700 1 |a Shen, Z.  |e author 
700 1 |a Yang, H.  |e author 
700 1 |a Zhang, S.  |e author 
773 |t Neural computation
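
The quantities stated in the abstract (field 520) can be evaluated numerically. The following is a minimal sketch, not code from the article: the function names are hypothetical, and the formulas are read with the square root of the depth parameter L in the exponent, consistent with the title ("Width to Power of Square Root of Depth"). It computes the network width and depth, the Hölder error bound 3λd^{α/2}N^{-α√L}, and the general rate ω_f(√d N^{-√L}) + 2ω_f(√d)N^{-√L}.

```python
import math


def floor_relu_size(N, L, d):
    """Width max{d, 5N+13} and depth 64dL+3, as quoted in the abstract."""
    return max(d, 5 * N + 13), 64 * d * L + 3


def holder_error_bound(N, L, d, alpha, lam):
    """Hoelder bound 3 * lam * d^(alpha/2) * N^(-alpha * sqrt(L))."""
    return 3.0 * lam * d ** (alpha / 2.0) * N ** (-alpha * math.sqrt(L))


def general_rate(N, L, d, omega):
    """General rate omega(sqrt(d) * N^(-sqrt(L))) + 2 * omega(sqrt(d)) * N^(-sqrt(L)).

    `omega` is a caller-supplied modulus of continuity (an assumption of
    this sketch), e.g. lambda r: lam * r ** alpha for a Hoelder function.
    """
    r = N ** (-math.sqrt(L))
    return omega(math.sqrt(d) * r) + 2.0 * omega(math.sqrt(d)) * r


if __name__ == "__main__":
    N, L, d, alpha, lam = 2, 4, 3, 0.5, 1.0
    print(floor_relu_size(N, L, d))
    print(holder_error_bound(N, L, d, alpha, lam))
    print(general_rate(N, L, d, lambda r: lam * r ** alpha))
```

For a Hölder modulus ω_f(r) ≤ λr^α with α ∈ (0,1], the general rate never exceeds the Hölder bound: the first term equals λd^{α/2}N^{-α√L}, and the second is at most 2λd^{α/2}N^{-α√L} because N^{-√L} ≤ N^{-α√L} when N ≥ 1 and α ≤ 1.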