CortexNet: A Robust Predictive Deep Neural Network Trained on Videos

Alfredo Canziani, Purdue University


In the past five years we have observed the rise of remarkably well-performing feed-forward neural networks trained with supervision for vision-related tasks. These models have achieved super-human performance on object recognition, localisation, and detection in still images. However, there is a need to identify the best strategy for employing these networks with temporal visual inputs in order to obtain a robust and stable representation of video data. Inspired by the human visual system, I propose a deep neural network family, CortexNet, which features not only bottom-up feed-forward connections but also models the abundant top-down feedback and lateral connections present in our visual cortex. I introduce two training schemes, the unsupervised MatchNet mode and the weakly supervised TempoNet mode, in which the network learns to correctly anticipate the subsequent frame of a video clip or the identity of its predominant subject, respectively, by learning egomotion cues and how to automatically track several objects in the current scene. Find the project website at
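The MatchNet scheme described above, in which a network is trained without labels to anticipate the next frame of a video, can be illustrated with a minimal sketch. The example below is hypothetical and not the dissertation's actual model: a single linear layer stands in for the full CortexNet, and a synthetic sequence of circularly shifted vectors stands in for real video frames. The training signal is simply the prediction error between the anticipated and the actual next frame.

```python
import numpy as np

# Hypothetical minimal sketch of next-frame prediction (MatchNet idea):
# a model is trained, without labels, to predict frame t+1 from frame t.
# A linear map stands in for the full network; the "video" is synthetic.

rng = np.random.default_rng(0)
D = 16                                 # flattened frame size
shift = np.roll(np.eye(D), 1, axis=0)  # ground-truth dynamics: circular shift
frames = [rng.standard_normal(D)]
for _ in range(63):
    frames.append(shift @ frames[-1])  # each frame is the previous one, shifted

W = np.zeros((D, D))                   # linear "network" predicting frame t+1
lr = 0.05
for epoch in range(200):
    for x, y in zip(frames[:-1], frames[1:]):
        err = W @ x - y                # prediction error drives learning
        W -= lr * np.outer(err, x)     # gradient step on the squared error

mse = np.mean([((W @ x) - y) ** 2 for x, y in zip(frames[:-1], frames[1:])])
print(f"final next-frame MSE: {mse:.6f}")
```

Because the synthetic dynamics are exactly linear, the predictor recovers the shift operator and the next-frame error shrinks toward zero; in the actual CortexNet the same prediction-error signal trains a deep network with feedback and lateral connections on real video.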




Culurciello, Purdue University.

Subject Area

Neurosciences, Biomedical engineering, Artificial intelligence
