Vanishing Nodes: The Phenomenon That Affects the Representation Power and the Training Difficulty of Deep Neural Networks


Bibliographic Details
Main Authors: Wen-Yu Chang, 張文于
Other Authors: Tsung-Nan Lin
Format: Others
Language: en_US
Published: 2019
Online Access: http://ndltd.ncl.edu.tw/handle/74v5yy
Description
Summary: Master's === National Taiwan University === Graduate Institute of Communication Engineering === 107 === It is well known that the problem of vanishing/exploding gradients creates a challenge when training deep networks. In this paper, we show another phenomenon, called vanishing nodes, that also increases the difficulty of training deep neural networks. As the depth of a neural network increases, the network's hidden nodes exhibit increasingly correlated behavior. This correlated behavior makes the nodes highly similar to one another, so the redundancy of hidden nodes grows as the network becomes deeper. We call this problem "Vanishing Nodes." This behavior of vanishing nodes can be characterized quantitatively by the network parameters and is shown analytically to be proportional to the network depth and inversely proportional to the network width. The numerical results suggest that the degree of vanishing nodes becomes more pronounced during back-propagation training. Finally, we show that vanishing/exploding gradients and vanishing nodes are two distinct challenges that increase the difficulty of training deep neural networks.
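The correlation effect described in the abstract can be illustrated numerically. The sketch below is not the thesis's code; it is a minimal NumPy experiment, assuming a randomly initialized tanh MLP with i.i.d. Gaussian weights scaled by 1/sqrt(width), that measures the mean absolute pairwise correlation between hidden-node activations at the last hidden layer. The function name `mean_node_correlation` and all hyperparameters are illustrative choices, not from the thesis.

```python
import numpy as np

def mean_node_correlation(depth, width, n_samples=1000, seed=0):
    """Mean absolute pairwise correlation between hidden nodes at the
    last hidden layer of a random depth-layer tanh MLP (illustrative)."""
    rng = np.random.default_rng(seed)
    # Random inputs, one row per sample, one column per node.
    h = rng.standard_normal((n_samples, width))
    for _ in range(depth):
        # i.i.d. Gaussian weights, variance 1/width (a common scaling).
        w = rng.standard_normal((width, width)) / np.sqrt(width)
        h = np.tanh(h @ w)
    # width x width correlation matrix of node activations across samples.
    c = np.corrcoef(h, rowvar=False)
    off_diag = c[~np.eye(width, dtype=bool)]
    return float(np.mean(np.abs(off_diag)))

shallow = mean_node_correlation(depth=3, width=32)
deep = mean_node_correlation(depth=100, width=32)
print(f"mean |corr| at depth 3:   {shallow:.3f}")
print(f"mean |corr| at depth 100: {deep:.3f}")
```

Under the abstract's claim, the deep network's nodes should show noticeably higher mean correlation than the shallow one's, and increasing `width` should weaken the effect; the exact numbers depend on the weight scaling and activation chosen here.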