Abstract

This paper describes and evaluates a tool to measure, compute, and display performance data of any machine in a wide-area, heterogeneous, distributed computing system. The monitoring tool is easily deployable, easily extensible, and supports various degrees of centralization without significant redesign effort. The tool leverages a widely used network protocol (SNMP) for communication, making it applicable to many distributed systems. The paper discusses the design and development of a monitoring system (SIMONE) and the performance studies conducted to evaluate the tool. SIMONE consists of a manager which requests, receives, and processes data from individual machines or hosts and presents the results to the user. The manager is designed to measure a set of performance parameters determined to be useful in a network-computing environment. The hosts of the target system run daemons which service requests from the manager. The reply to each request consists of the variable values obtained from the host. Performance measurements carried out on the prototype SIMONE are reported and compared to similar measurements using alternate monitoring methods. Performance metrics include resolution of monitored measurements, latency between data request and presentation, and communication and CPU overheads. The performance of SIMONE shows significant improvement (better resolution, less latency, lower overhead) over that of alternate monitoring methods.

Date of this Version

July 2000
