RCNX: Residual Capsule Next

Arjun Narukkanchira Anilkumar, Purdue University

Abstract

New machine learning models appear every day. Most computer vision models derive from the basic structure of the Convolutional Neural Network (CNN). Machine learning developers use CNNs extensively in image classification, object recognition, and image segmentation. Although CNNs produce highly compatible models with superior accuracy, they have disadvantages. Estimating pose and transformation for computer vision applications is a difficult task for a CNN, whose functions are capable of learning only shift-invariant features of an image. These limitations motivate machine learning developers to create more complex algorithms. The search for new machine learning models led to Capsule Networks, which were able to estimate the pose of objects in an image and recognize transformations applied to those objects. Handwritten digit classification was the task capsule networks were initially designed to solve. Capsule Networks outperform all models on the MNIST dataset of handwritten digits, but applying them to image classification is not a matter of simply multiplying the number of parameters. By replacing the Capsule Network's initial layer, a simple convolutional layer, with more complex CNN architectures, the authors of the Residual Capsule Network achieved a tremendous change in capsule network applications without a high number of parameters. This thesis focuses on improving the recent Residual Capsule Network (RCN) to the point where accuracy and model size are optimal for the image classification task, benchmarked on the CIFAR-10 dataset. Our search for an exemplary capsule network led to the invention of RCN2: Residual Capsule Network 2 and RCNX: Residual Capsule NeXt, the next generation of RCN. Both outperform existing capsule network architectures for image classification, such as 3-level RCN, DCNet, DCNet++, and the Capsule Network, and even outperform compact CNNs like MobileNet V3. RCN2 achieved an accuracy of 85.12% with 1.95 million parameters, and RCNX achieved 89.31% accuracy with 1.58 million parameters on the CIFAR-10 benchmark.

Degree

M.Sc.

Advisors

El-Sharkawy, Purdue University.

Subject Area

Aerospace engineering|Artificial intelligence|Mathematics

