Distributed systems composed of interacting services need runtime error detection to catch errors arising from software bugs, hardware faults, or unexpected operating conditions. A significant class of detection systems performs detection at the application level, based on the state of the application. For example, rule-based systems match rules against the application’s state as deduced by the detection system at runtime. Many large-scale distributed applications generate messages at a high rate, which can overwhelm the capacity of the detection system. One way to handle this is sampling, that is, processing only a fraction of the messages. However, sampling introduces non-determinism in the detection system’s view of which state the application is in. This in turn leads to inaccuracies in matching state-based rules, degrading the quality of detection. In this work, we present an approach for selecting which messages to sample and process so that this non-determinism is minimized. We then present a Hidden Markov Model-based technique to probabilistically identify which application states are most likely, so that the detection system can perform rule-based detection for only those states. We demonstrate the techniques in a detection system called Monitor, applied to a Java-based three-tier online banking system. The techniques do not need application modifications or an a priori application model, but they do require knowledge of expected application behavior in order to write the rules. We empirically evaluate the accuracy and precision of detection under different load conditions and compare our solution with two other state-of-the-art systems: Pinpoint and the Convolution algorithm.
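To make the idea of probabilistic state identification concrete, the following is a minimal sketch of standard Hidden Markov Model filtering, not the implementation used in Monitor. The three application states, the transition and emission probabilities, and the function names `forward_step` and `likely_states` are all hypothetical and chosen only for illustration. An unsampled message is modeled here as `None`, in which case the belief is only propagated through the transition matrix; this treatment is an assumption of the sketch.

```python
# Illustrative HMM filtering over application states (all numbers are made up).

def forward_step(belief, transition, emission, observation):
    """Advance the state belief by one message.

    belief[i]        : current probability that the application is in state i
    transition[i][j] : probability of moving from state i to state j
    emission[j][o]   : probability of observing message type o in state j
    observation      : observed message type, or None if the message was not sampled
    """
    n = len(belief)
    # Prediction: push the belief through the transition matrix.
    predicted = [sum(belief[i] * transition[i][j] for i in range(n)) for j in range(n)]
    if observation is None:
        return predicted  # nothing observed, so uncertainty simply spreads
    # Correction: weight each state by how well it explains the observed message.
    weighted = [predicted[j] * emission[j][observation] for j in range(n)]
    total = sum(weighted)
    if total == 0.0:
        return predicted  # observation impossible under the model; keep the prediction
    return [w / total for w in weighted]


def likely_states(belief, k):
    """Return the indices of the k most probable states, most probable first."""
    return sorted(range(len(belief)), key=lambda s: belief[s], reverse=True)[:k]


# Hypothetical three-state model with three message types.
transition = [[0.7, 0.2, 0.1],
              [0.1, 0.7, 0.2],
              [0.2, 0.1, 0.7]]
emission = [[0.8, 0.1, 0.1],
            [0.1, 0.8, 0.1],
            [0.1, 0.1, 0.8]]

belief = [1.0 / 3, 1.0 / 3, 1.0 / 3]   # start with no knowledge of the state
for message in [1, None, 1]:           # the middle message was dropped by sampling
    belief = forward_step(belief, transition, emission, message)

candidates = likely_states(belief, 2)  # states for which rules would be checked
print(candidates, [round(p, 3) for p in belief])
```

In a setup like this, a rule-based detector would match rules only against the states in `candidates` rather than against every possible state, which is the kind of pruning the abstract describes.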

Secondary Subject Category

Computer Science (0984)

Date of this Version

July 2008


Department

Electrical and Computer Engineering

Month of Graduation


Year of Graduation



Degree

Master of Science in Electrical and Computer Engineering

Head of Graduate Program

Mark J.T. Smith

Advisor 1 or Chair of Committee

Saurabh Bagchi

Committee Member 1

Jan P. Allebach

Committee Member 2

Sanjay Rao