Training deep neural networks (DNNs) takes significant time and resources. A practice for expedited deployment is to use pre-trained deep neural networks (PTNNs), often from model zoos.collections of PTNNs; yet, the reliability of model zoos remains unexamined. In the absence of an industry standard for the implementation and performance of PTNNs, engineers cannot confidently incorporate them into production systems. As a first step, discovering potential discrepancies between PTNNs across model zoos would reveal a threat to model zoo reliability. Prior works indicated existing variances in deep learning systems in terms of accuracy. However, broader measures of reliability for PTNNs from model zoos are unexplored. This work measures notable discrepancies between accuracy, latency, and architecture of 36 PTNNs across four model zoos. Among the top 10 discrepancies, we find differences of 1.23%-2.62% in accuracy and 9%ś131% in latency. We also find mismatches in architecture for well-known DNN architectures (e.g., ResNet and AlexNet). Our findings call for future works on empirical validation, automated tools for measurement, and best practices for implementation.
Neural networks, Model zoos, Software reuse, Empirical software engineering
Date of this Version
Discrepancies among Pre-trained Deep Neural Networks: A New Threat to Model Zoo Reliability. Montes, Peerapatanapokin, Schultz, Guo, Jiang, and Davis. Proceedings of the 30th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering: Ideas, Visions, and Reflections track (ESEC/FSE-IVR) 2022.