Summary: | Next-frame prediction, an unsupervised representation learning problem in deep learning, is a new and promising research direction in computer vision: given a sequence of past frames, a model predicts plausible future images. It has broad application value in robot decision making and autonomous driving. In this paper, we review recent state-of-the-art next-frame prediction networks and categorize them into two architectures: sequence-to-one and sequence-to-sequence. We compare these approaches in terms of network architecture and loss function design, and analyze their pros and cons. We then quantitatively compare their performance on readily available datasets using the corresponding evaluation metrics. Finally, we point out promising directions for future research.
|