Handling Vanishing Gradient Problem Using Artificial Derivative

The sigmoid function and ReLU are commonly used activation functions in neural networks (NNs). However, the sigmoid function is vulnerable to the vanishing gradient problem, while ReLU suffers from a special form of vanishing gradient known as the dying ReLU problem. Though many studies have provided methods to alleviat...
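The record does not show the paper's exact formulation, but the title suggests substituting an "artificial derivative" for the true activation derivative during backpropagation. A minimal PyTorch sketch of that general idea is given below, assuming (hypothetically) that the artificial derivative replaces ReLU's zero gradient with a small constant slope so that inactive units still receive gradient; the class name and the `negative_grad` parameter are illustrative, not taken from the paper.

```python
import torch


class ArtificialDerivativeReLU(torch.autograd.Function):
    """Standard ReLU forward pass with an artificial (surrogate) gradient.

    Hypothetical illustration: where the true ReLU derivative is 0
    (x <= 0), a small constant slope is used instead, so gradients keep
    flowing and units are less likely to "die"."""

    @staticmethod
    def forward(ctx, x, negative_grad=0.01):
        ctx.save_for_backward(x)
        ctx.negative_grad = negative_grad
        return x.clamp(min=0)  # ordinary ReLU output

    @staticmethod
    def backward(ctx, grad_output):
        (x,) = ctx.saved_tensors
        # Artificial derivative: 1 where x > 0, small constant elsewhere
        surrogate = torch.where(
            x > 0,
            torch.ones_like(x),
            torch.full_like(x, ctx.negative_grad),
        )
        return grad_output * surrogate, None


# Usage: gradients reach inputs even where x <= 0
x = torch.randn(4, requires_grad=True)
y = ArtificialDerivativeReLU.apply(x, 0.01)
y.sum().backward()
print(x.grad)
```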


Bibliographic Details
Main Authors: Zheng Hu, Jiaojiao Zhang, Yun Ge
Format: Article
Language: English
Published: IEEE 2021-01-01
Series: IEEE Access
Subjects:
Online Access: https://ieeexplore.ieee.org/document/9336631/