Harnessing heterogeneous resources for improving Internet data transfer

Himabindu Pucha, Purdue University

Abstract

The proliferation of wide-area data-intensive applications such as peer-to-peer file sharing and large web downloads has escalated the importance of high-throughput, reliable data transfers. This dissertation develops techniques to efficiently use available resources such as infrastructure overlay nodes, network peers, and local disk to systematically improve Internet data transfer performance. We first focus on improving performance between a single sender-receiver pair by addressing the end-to-end feedback limitation in TCP, a popular transport protocol. Our solution, Slot, shortens the end-to-end feedback loop using resources from an infrastructure overlay network, thereby improving overall performance. This dissertation addresses the design challenges in Slot of discovering an efficient overlay path in a scalable and practical fashion and of supporting multiple clients using a common overlay. At the core of Slot is a measurement infrastructure that monitors network properties for different overlay paths. The design of this infrastructure raises interesting questions regarding the dynamics of network properties and their time scales. This dissertation answers these questions using network delay as an example. Together, an understanding of the dynamics of network properties and an overlay-based data transfer that exploits such knowledge make Slot an effective way to improve data transfer between a sender-receiver pair. Although Slot improves performance between a single sender-receiver pair, a single sender is sometimes unable to saturate the download bandwidth of a receiver, possibly because of bandwidth asymmetry in the access links of end-hosts or network congestion in the core of the Internet. Multi-source transfers (e.g., BitTorrent) attempt to address this single-source limitation. Our observations, however, indicate that bulk transfers remain slow despite current multi-source systems.
In the second part of this dissertation, we investigate techniques that exploit additional resources to improve multi-source transfer performance. We present the design and implementation of SET, a system that locates available identical and similar sources for data objects by performing a constant number of lookups and inserting a constant number of mappings per object into a global database. We also consider the use of the local disk as an additional resource that can provide content required to complete a data transfer. Finally, we present the design, implementation, and evaluation of dsync, a file transfer system that can dynamically adapt to a wide variety of environments while using all available resources to improve transfer performance. While many transfer systems work well in their specialized context, their performance comes at the cost of generality, and they perform poorly when used elsewhere. In contrast, dsync adapts to its environment by intelligently determining which of its available resources is best to use at any given time. Our experience shows that Slot can improve the throughput of a single sender-receiver connection by 60-100%. dsync can combine the benefits of identical and similar sources over the network with local disk resources to outperform existing systems by a factor of 1.4 to 5 in one-to-one and one-to-many transfers.

Degree

Ph.D.

Advisors

Hu, Purdue University.

Subject Area

Computer science

