Mutual Information Based Learning Rate Decay for Stochastic Gradient Descent Training of Deep Neural Networks

This paper presents a novel approach to training deep neural networks using a Stochastic Gradient Descent (SGD) algorithm with a Mutual Information (MI)-driven, decaying Learning Rate (LR). The MI between the network's output and the true outcomes is used to adaptively set the LR for the network,...
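The paper's exact update rule is not given in this record, but the idea of monitoring MI between predictions and true labels to drive LR decay can be sketched as follows. This is a minimal illustration, not the authors' algorithm: the plug-in MI estimator and the "halve the LR when MI stops improving" rule (`mi_decayed_lr`, with its thresholds) are hypothetical choices for demonstration.

```python
import numpy as np

def discrete_mutual_information(y_pred, y_true):
    """Plug-in MI estimate (in nats) between predicted and true class labels."""
    n_classes = int(max(y_pred.max(), y_true.max())) + 1
    joint = np.zeros((n_classes, n_classes))
    for p, t in zip(y_pred, y_true):
        joint[p, t] += 1
    joint /= joint.sum()                      # joint distribution p(pred, true)
    px = joint.sum(axis=1, keepdims=True)     # marginal over predictions
    py = joint.sum(axis=0, keepdims=True)     # marginal over true labels
    nz = joint > 0                            # avoid log(0)
    return float((joint[nz] * np.log(joint[nz] / (px @ py)[nz])).sum())

def mi_decayed_lr(current_lr, mi_history, floor=1e-4, tol=1e-3):
    """Hypothetical decay rule: halve the LR when the MI gain between
    successive epochs falls below `tol`; never go below `floor`."""
    if len(mi_history) < 2:
        return current_lr
    gain = mi_history[-1] - mi_history[-2]
    return max(current_lr * (0.5 if gain <= tol else 1.0), floor)
```

For example, perfectly matching binary predictions yield MI equal to the label entropy (ln 2 for balanced classes), while a stalled MI history triggers a halving of the learning rate.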


Bibliographic Details
Main Author: Shrihari Vasudevan
Format: Article
Language: English
Published: MDPI AG, 2020-05-01
Series: Entropy
Online Access: https://www.mdpi.com/1099-4300/22/5/560