AI on the Edge with CondenseNeXt: An Efficient Deep Neural Network for Devices with Constrained Computational Resources

Priyank B Kalgaonkar, Purdue University

Abstract

Artificial Intelligence (AI) is intelligence demonstrated by machines, analogous to the natural intelligence exhibited by living creatures of the Animalia kingdom, which involves emotion and consciousness to a certain extent. Advances in computer technology, together with access to copious amounts of data for multinomial classification arising from the digitization of human society, inexpensive cameras, and the Internet of Things (IoT), have fueled research and development in machine learning and perception, also known as computer vision. Convolutional Neural Networks (CNNs) are a class of Deep Neural Networks (DNNs), which are a subset of Machine Learning (ML), which in turn is one technique for the realization of AI. CNNs are becoming increasingly popular in computer vision for performing fundamental tasks such as image classification, object detection, and image segmentation in real-world applications including, but not limited to, self-driving vehicles, robotics, and Unmanned Aerial Vehicles (UAVs), commonly known as drones. This rise in the popularity of AI, along with advancements in edge devices such as mobile embedded computing platforms ranging from single-core, single-threaded processors to multi-core, multi-threaded processors, has created the need for research and development of increasingly efficient state-of-the-art deep neural network architectures. One of the first data- and computationally intensive convolutional neural network algorithms began dominating accuracy benchmarks in 2012 [1], and with recent developments in AI and embedded systems, the desire for CNNs that infer efficiently yet accurately has grown. This thesis proposes a neoteric variant of a deep convolutional neural network architecture that incorporates state-of-the-art techniques such as depthwise separable convolution and model compression (pruning), which reduce forward FLOPs and increase overall accuracy (decrease the error rate). The result is outstanding performance both when training from scratch and during real-time image classification, observed by deploying the trained weights on NXP's BlueBox, an ARM-based autonomous embedded computing platform designed for self-driving vehicles, and benchmarked across three popular computer vision datasets: CIFAR-10, CIFAR-100, and ImageNet.
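For readers unfamiliar with the depthwise separable convolution mentioned above, the sketch below is a minimal, generic PyTorch illustration of the technique, not the exact CondenseNeXt block from this thesis; the class name DepthwiseSeparableConv and the channel and kernel sizes are illustrative assumptions. It shows how a standard k x k convolution is split into a per-channel (depthwise) convolution followed by a 1x1 (pointwise) convolution, which is the source of the FLOP reduction.

    import torch
    import torch.nn as nn

    class DepthwiseSeparableConv(nn.Module):
        """Depthwise separable convolution: a depthwise (per-channel) k x k
        convolution followed by a pointwise (1x1) convolution.

        Relative to a standard k x k convolution with C_in inputs and C_out
        outputs, the multiply-accumulate count drops roughly by a factor of
        1/C_out + 1/k^2 (e.g. about 8-9x for k = 3 and large C_out)."""

        def __init__(self, in_channels, out_channels, kernel_size=3, stride=1):
            super().__init__()
            # Depthwise: one k x k filter per input channel (groups=in_channels).
            self.depthwise = nn.Conv2d(in_channels, in_channels, kernel_size,
                                       stride=stride, padding=kernel_size // 2,
                                       groups=in_channels, bias=False)
            # Pointwise: 1x1 convolution mixes information across channels.
            self.pointwise = nn.Conv2d(in_channels, out_channels, kernel_size=1,
                                       bias=False)
            self.bn = nn.BatchNorm2d(out_channels)
            self.act = nn.ReLU(inplace=True)

        def forward(self, x):
            return self.act(self.bn(self.pointwise(self.depthwise(x))))

    # Example: a 3x3 separable convolution on a 64-channel, 32x32 feature map.
    x = torch.randn(1, 64, 32, 32)
    y = DepthwiseSeparableConv(64, 128)(x)
    print(y.shape)  # torch.Size([1, 128, 32, 32])

As a rough worked example under the same assumptions (3x3 kernel, 64 input and 128 output channels, 32x32 feature map), a standard convolution costs about 32*32*64*128*9 ~= 75.5 M multiply-accumulates, while the separable version costs about 32*32*64*(9 + 128) ~= 9.0 M, an approximately 8.4x reduction.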

Degree

M.Sc.

Advisors

El-Sharkawy, Purdue University.

Subject Area

Artificial intelligence|Computer science|Information Technology|Web Studies
