Protecting Production Systems from Performance Anomalies
Performance bugs are common in commodity software. Performance profilers and other source-code-based tools can diagnose a program at the development stage, in a well-defined environment, yet many performance problems survive that stage and affect production runs. Diagnosing them in production is challenging because often only program binaries are available, with limited or no knowledge of the source code. The problem is compounded by the heavy integration of third-party software into most large-scale applications and by the need for diagnostic tools that impose minimal overhead. In light of these challenges, we propose two novel systems for troubleshooting performance problems in production environments: IntroPerf and PerfGuard. IntroPerf enables transparent, context-sensitive performance inference and diagnoses application performance across multiple layers, from user functions to the kernel. Evaluated on various performance bugs in multiple open source software projects, IntroPerf automatically ranks potential internal and external root causes of performance bugs with high accuracy and without any prior knowledge. PerfGuard is an automated approach to analyzing application binaries and instrumenting the binary code for performance diagnosis in production. More specifically, PerfGuard automatically identifies application transactions through dynamic program analysis and injects performance assertions on the identified transactions for efficient run-time performance monitoring and diagnosis. Our evaluation results demonstrate that IntroPerf and PerfGuard can effectively introspect performance anomalies in production environments with very low run-time overhead and without any prior development knowledge or access to source code.
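To make the idea of a performance assertion on a transaction concrete, the following is a minimal source-level sketch in Python. It is only an analogue: PerfGuard itself instruments program binaries, and the names used here (`perf_assert`, `violations`, `budget_seconds`, and the injectable `clock` parameter) are hypothetical and do not come from the dissertation.

```python
import time
from contextlib import contextmanager

# Hypothetical in-memory log of budget violations (illustration only).
violations = []

@contextmanager
def perf_assert(transaction, budget_seconds, clock=time.perf_counter):
    """Time the wrapped transaction and record a violation if it
    runs longer than its latency budget."""
    start = clock()
    try:
        yield
    finally:
        elapsed = clock() - start
        if elapsed > budget_seconds:
            # A real monitor might trigger deeper diagnosis here;
            # this sketch only records the event.
            violations.append((transaction, elapsed, budget_seconds))
```

A transaction would be wrapped as `with perf_assert("checkout", 0.2): ...`. The check costs two clock reads per transaction, which is one plausible way such monitoring can stay cheap at run time.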
Zhang, Purdue University.