A STUDY OF MULTISTAGE INTERCONNECTION NETWORKS: DESIGN, DISTRIBUTED CONTROL, FAULT TOLERANCE, AND PERFORMANCE (ADM, IADM)

ROBERT JAMES MCMILLEN, Purdue University

Abstract

The interconnection of a large number of processors and other devices to form a large-scale parallel/distributed computing system is a research area receiving a great deal of attention. To characterize the operating environment of such systems, seven large-scale computers that have been proposed and/or built are examined and described. A major consideration in the development of these parallel processing systems is the design of the interconnection network that provides communications among the processors and other devices. To gain proper historical perspective, 17 networks and seven classes of networks are surveyed, beginning with the early telephone switching networks. A family tree showing their relationships to one another is included. From the survey, two major classes are identified: the cube type and the data manipulator type. In the research presented, the Generalized Cube network is used to represent the former, and the augmented data manipulator (ADM) and inverse ADM (IADM) networks are used to represent the latter. These three networks are examined and compared. The architecture of the switching elements for the networks is studied in depth. Different designs are compared in terms of cost and performance. Using a graph-theoretic approach, Generalized Cube and ADM networks of comparable size are evaluated in terms of total cost and inherent fault tolerance, or robustness. For large-scale systems, it is important to distribute control of the network among its users (e.g., processors) to avoid the bottleneck that could occur in a centralized controller. Routing tag schemes are developed for the networks for this purpose. Both one-to-one and one-to-many (broadcast) communications are supported. Since the ADM and IADM networks are known to have multiple source-to-destination paths, methods for dynamically switching between paths using routing tags are investigated to avoid busy or faulty links and switching elements. Finally, fault tolerance is considered. Methods to exploit the inherent fault tolerance of the IADM network by adding some extra hardware are explored. The effect of additional hardware on the routing tag scheme is taken into account.

Degree

Ph.D.

Subject Area

Electrical engineering

