The dynamic nature of large-scale Network Computing Systems (NCSs) and the varying monitoring demands of end users pose serious challenges for monitoring systems (MSs). A statically configured MS that is initially tuned to perform optimally may end up performing poorly as conditions change. A reconfiguration mechanism for a distributed MS is proposed. It enables the MS to react to changes in the available resources, operating conditions, and monitoring requirements, while maintaining high performance and low monitoring overhead. The distributed MS is organized as a tree consisting of managed nodes running agents, one or more levels of intermediate-level managers (ILMs), and a top-level manager (TLM) for overall control. Reconfiguration decisions are made locally, each involving two adjacent ILM levels. The current values of a local node performance parameter called temperature are used to determine the transformation (merge, split, or migrate) applied to each ILM. The implementation uses SNMP primitives for easy integration into SIMONE, a distributed SNMP-based monitoring system. The interactions between the MS elements and different classes of jobs are studied by defining a queuing model and by evaluating different configuration schemes through simulation. Results for the static and reconfigurable schemes indicate that reconfiguration improves performance, in the form of lower processing delays at the ILMs.
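
The localized decision step can be illustrated with a minimal sketch. It is not the mechanism's actual rule set: the abstract does not state how temperature is computed or which thresholds trigger each transformation, so the `HOT` and `COLD` thresholds, the `ILM` fields, the `plan` function, and the pairing rule for merges are all hypothetical. The sketch only shows how a parent ILM might map the temperatures of its child ILMs to merge, split, or migrate actions.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List, Optional


class Action(Enum):
    NONE = "none"
    MERGE = "merge"
    SPLIT = "split"
    MIGRATE = "migrate"


@dataclass
class ILM:
    name: str
    temperature: float  # hypothetical normalized load indicator in [0, 1]
    agents: int         # number of agents reporting to this ILM


# Hypothetical thresholds; the source gives no numeric values.
HOT = 0.8
COLD = 0.2


def plan(children: List[ILM],
         spare_node_temperature: Optional[float] = None) -> Dict[str, Action]:
    """Decide one action per child ILM, as seen from the parent level."""
    actions = {c.name: Action.NONE for c in children}

    # Assumed rule: pair up cold siblings and merge each pair.
    cold = [c for c in children if c.temperature <= COLD]
    for a, b in zip(cold[0::2], cold[1::2]):
        actions[a.name] = Action.MERGE
        actions[b.name] = Action.MERGE

    # Assumed rule: a hot ILM migrates if a cold spare node exists,
    # otherwise it splits when it has at least two agents to divide.
    for c in children:
        if c.temperature >= HOT:
            if spare_node_temperature is not None and spare_node_temperature <= COLD:
                actions[c.name] = Action.MIGRATE
            elif c.agents >= 2:
                actions[c.name] = Action.SPLIT
    return actions


children = [ILM("a", 0.9, 10), ILM("b", 0.1, 3), ILM("c", 0.15, 2), ILM("d", 0.5, 5)]
for name, action in plan(children).items():
    print(name, action.value)
```

Keeping the decision inside one parent and its children mirrors the two-adjacent-level locality described above, so no global view of the tree is needed for a single step.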


monitoring, network computing systems, reconfiguration, SNMP

Date of this Version

January 2002