Neural network approximation: Three hidden layers are enough.

Journal Article

A three-hidden-layer neural network with super approximation power is introduced. This network is built with the floor function (⌊x⌋), the exponential function (2^x), the step function (1_{x≥0}), or their compositions as the activation function in each neuron, and hence we call such networks Floor-Exponential-Step (FLES) networks. For any width hyper-parameter N ∈ N^+, it is shown that FLES networks with width max{d, N} and three hidden layers can uniformly approximate a Hölder continuous function f on [0,1]^d with an exponential approximation rate 3λ(2√d)^α 2^{−αN}, where α ∈ (0,1] and λ > 0 are the Hölder order and constant, respectively. More generally, for an arbitrary continuous function f on [0,1]^d with modulus of continuity ω_f(·), the constructive approximation rate is 2ω_f(2√d) 2^{−N} + ω_f(2√d · 2^{−N}). Moreover, we extend this result to general bounded continuous functions on a bounded set E ⊆ R^d. As a consequence, this new class of networks overcomes the curse of dimensionality in approximation power when the variation of ω_f(r) as r → 0 is moderate (e.g., ω_f(r) ≲ r^α for Hölder continuous functions), since the dominant term in our approximation rate is essentially √d times a function of N independent of d inside the modulus of continuity. Finally, we extend our analysis to derive similar approximation results in the L^p-norm for p ∈ [1,∞) by replacing the Floor-Exponential-Step activation functions with continuous activation functions.
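
To make the stated rate concrete, here is a minimal Python sketch (ours, not from the paper): it defines the three FLES activations named in the abstract and evaluates the Hölder-case rate 3λ(2√d)^α 2^{−αN} for a few widths and dimensions. The names floor_act, exp_act, step_act, and holder_rate are illustrative assumptions, not identifiers from the paper.

    import math

    # The three FLES activations named in the abstract.
    def floor_act(x):
        return math.floor(x)           # floor function ⌊x⌋

    def exp_act(x):
        return 2.0 ** x                # exponential function 2^x

    def step_act(x):
        return 1.0 if x >= 0 else 0.0  # step function 1_{x≥0}

    # Uniform approximation rate from the abstract for a Hölder continuous
    # f on [0,1]^d with order alpha and constant lam:
    # 3 * lam * (2√d)^alpha * 2^(-alpha * N).
    def holder_rate(d, N, alpha=1.0, lam=1.0):
        return 3.0 * lam * (2.0 * math.sqrt(d)) ** alpha * 2.0 ** (-alpha * N)

    # The rate decays exponentially in the width hyper-parameter N,
    # while the dimension d enters only through the mild (2√d)^alpha factor.
    for d in (10, 1000):
        for N in (10, 20, 40):
            print(f"d={d:>4}, N={N:>2}: rate ≈ {holder_rate(d, N):.3e}")

For α = 1 the printout illustrates the claimed escape from the curse of dimensionality: doubling N from 10 to 20 squares the 2^{−N} factor, while multiplying d by 100 inflates the rate only by a factor of 10.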

Cited Authors

  • Shen, Z; Yang, H; Zhang, S

Published Date

  • September 2021

Published In

  • Neural Networks

Volume / Issue

  • 141

Start / End Page

  • 160 - 173

PubMed ID

  • 33906082

Electronic International Standard Serial Number (EISSN)

  • 1879-2782

International Standard Serial Number (ISSN)

  • 0893-6080

Digital Object Identifier (DOI)

  • 10.1016/j.neunet.2021.04.011

Language

  • eng