Parallel file access on workstation clusters
Abstract
Efficient parallel file access requires both an interface capable of expressing parallel semantics and a network capable of supporting the communication primitives used by file systems. On workstation clusters, both are difficult to obtain. Efficient communication primitives are difficult to build because common cluster interconnects do not directly support many parallel file system operations. Moreover, cluster nodes typically run UNIX, which requires the use of the UNIX I/O interface, an interface that does not lend itself easily to parallel operations. Parallel file systems on clusters have commonly been implemented as user-level libraries on top of either NFS or a node-local sequential file system. This approach has two drawbacks: it incurs unnecessary system call overhead, because the parallel libraries frequently have to make multiple system calls to implement a single parallel operation, and it does not allow system-level file management utilities to be used to manage files. This thesis presents new techniques that improve the efficiency of parallel file systems on clusters by addressing two key areas: kernel integration and communication efficiency. Kernel integration is achieved by using inode tunneling to allow the standard UNIX I/O interface to be used for parallel file access, and by using segment trees to integrate a parallel file namespace with VFS. Communication efficiency is improved by using an Aggregate Function Network (AFN), which is better suited to parallel file system operations, as the cluster interconnect. To demonstrate these methods, a prototype file system was implemented under Linux; it shows that the techniques presented in this thesis allow for significant performance improvement over existing parallel file systems on clusters.
Degree
Ph.D.
Advisors
Dietz, Purdue University.
Subject Area
Electrical engineering; Computer science