Towards understanding residual neural networks

Bibliographic Details
Main Author: Zeng, Brandon.
Other Authors: Aleksander Mądry.
Format: Others
Language: English
Published: Massachusetts Institute of Technology, 2019
Online Access: https://hdl.handle.net/1721.1/123067
Description
Summary: Thesis: M. Eng., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2019. Cataloged from the PDF version of the thesis. Includes bibliographical references (page 37).

Abstract: Residual networks (ResNets) are now a prominent architecture in deep learning. However, an explanation for their success remains elusive. The original view is that residual connections allow the training of deeper networks, but it is not clear that added layers are always useful, or even how they are used. In this work, we find that residual connections distribute learning behavior across layers, allowing ResNets to effectively use deeper layers and outperform standard networks. We support this explanation with results on network gradients and representation learning, which show that residual connections make the training of individual residual blocks easier.

By Brandon Zeng. M. Eng., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science.
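
The residual connection the abstract refers to computes y = x + F(x): the identity path carries the input around a small stack of layers F, so gradients can reach earlier blocks directly. The following is a minimal sketch of one such block, assuming PyTorch; the class name ResidualBlock and the two-convolution form of F are illustrative choices, not the thesis's exact architecture.

    import torch
    import torch.nn as nn

    class ResidualBlock(nn.Module):
        # One residual block computing y = relu(x + F(x)), where F is
        # two 3x3 convolutions with batch norm. The identity path ("+ x")
        # is the residual connection discussed in the abstract.
        def __init__(self, channels):
            super().__init__()
            self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
            self.bn1 = nn.BatchNorm2d(channels)
            self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
            self.bn2 = nn.BatchNorm2d(channels)

        def forward(self, x):
            out = torch.relu(self.bn1(self.conv1(x)))
            out = self.bn2(self.conv2(out))
            # Skip connection: gradients flow through the identity term
            # unimpeded, which is the mechanism the thesis credits with
            # distributing learning across layers.
            return torch.relu(out + x)

    # Usage: shapes are preserved, so blocks stack to arbitrary depth.
    x = torch.randn(8, 16, 32, 32)
    block = ResidualBlock(16)
    y = block(x)  # y.shape == torch.Size([8, 16, 32, 32])

Because the block preserves its input shape, chains of such blocks can be made arbitrarily deep, which is the setting in which the thesis compares ResNets to standard (non-residual) networks.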