Next generation bandwidth-efficient network codes for distributed data storage

Imad Ahmad, Purdue University

Abstract

Regenerating codes (RCs) are ingeniously designed codes that mitigate the repair-bandwidth problem of erasure codes (ECs) in distributed storage networks. The key features of RCs are functional repair and partial information exchange (β < α). Locally repairable codes (LRCs), on the other hand, were proposed recently to address the disk I/O overhead problem of distributed storage codes. LRCs are designed so that only a bounded number d of helper nodes (usually small, d < k) participate in repair. Most existing LRCs assume exact repair and allow full exchange of the stored data (β = α) from the helper nodes, so they lack the features of the original RCs that may further reduce the repair bandwidth. Motivated by the significant bandwidth reduction of RCs over ECs, this thesis introduces the concept of “locally repairable regenerating codes (LRRCs)”, which simultaneously admit all three features: local repairability, partial information exchange, and functional repair. Significant bandwidth reduction is observed. Under the setting of LRRCs, this thesis answers the following fundamental question: under what condition does proactively choosing the helper nodes improve the storage-bandwidth tradeoff? Through a graph-based analysis, this thesis provides a necessary and sufficient condition under which optimally choosing good helpers strictly improves the storage-bandwidth tradeoff. A low-complexity helper selection solution, termed the family helper selection (FHS) scheme, is proposed and the corresponding storage-bandwidth tradeoff is characterized. Moreover, an explicit construction of an exact-repair code is proposed that achieves the minimum-bandwidth-regenerating (MBR) point of the FHS scheme. The new exact-repair code can be viewed as a generalization of the existing fractional repetition codes.
Furthermore, one important issue that must be addressed by any code with local repair (including both LRCs and LRRCs) is that designated helper nodes may sometimes be temporarily unavailable as a result of multiple failures, degraded reads, or other network dynamics. This thesis extends its study of LRRCs to the scenario of node unavailability and proves, for the first time in the literature, that all existing methods of helper selection can sometimes be strictly suboptimal in repair bandwidth. For some scenarios, it is necessary to combine LRRCs with a new helper selection method, termed dynamic helper selection, to achieve the optimal repair bandwidth.
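
As background for the storage-bandwidth tradeoff mentioned above, the following is a minimal sketch of the classical cut-set bound for ordinary regenerating codes, taken from the standard RC literature rather than from this thesis. It assumes the classical setting k ≤ d ≤ n − 1 with helpers not chosen proactively; the tradeoffs the thesis derives for the FHS scheme and for LRRCs with d < k are different and are not reproduced here. The function names are illustrative only.

```python
from fractions import Fraction


def max_file_size(k, d, alpha, beta):
    """Classical cut-set bound for regenerating codes (assumes k <= d).

    alpha: storage per node; beta: data sent by each of the d helpers.
    Returns sum_{i=0}^{k-1} min(alpha, (d - i) * beta), the largest file
    size that any k nodes are guaranteed to be able to reconstruct.
    """
    if k > d:
        raise ValueError("this sketch covers only the classical case k <= d")
    return sum(min(alpha, (d - i) * beta) for i in range(k))


def mbr_point(file_size, k, d):
    """Minimum-bandwidth-regenerating point: alpha = d * beta."""
    beta = Fraction(2 * file_size, k * (2 * d - k + 1))
    return d * beta, beta  # (alpha, beta)


def msr_point(file_size, k, d):
    """Minimum-storage-regenerating point: alpha = file_size / k."""
    alpha = Fraction(file_size, k)
    beta = Fraction(file_size, k * (d - k + 1))
    return alpha, beta


if __name__ == "__main__":
    # Hypothetical parameters chosen only for illustration.
    k, d, file_size = 2, 3, 5
    alpha, beta = mbr_point(file_size, k, d)
    print("MBR:", alpha, beta, "repair bandwidth:", d * beta)
    print("bound:", max_file_size(k, d, alpha, beta))
```

At the MBR point the repair bandwidth dβ equals the per-node storage α, which is the smallest repair bandwidth this bound allows; the MSR point instead minimizes α.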

Degree

Ph.D.

Advisors

Wang, Purdue University.

Subject Area

Computer engineering, Electrical engineering, Computer science
