A new sub-packetization bound for minimum storage regenerating codes
Codes for distributed storage systems are often designed to sustain failure of multiple storage disks. Specifically, an (n, k) MDS code stores k symbols in n disks such that the overall system is tolerant to a failure of up to n - k disks. However, access to at least k disks is still required to repair a single erasure. To reduce repair bandwidth, array codes are used where the stored symbols or packets are vectors of length ℓ. MDS array codes can potentially repair a single erasure using a fraction l/(n - k) of data stored in the surviving nodes. We ask the following question: for a given (n, k), what is the minimum vector-length or sub-packetization factor ℓ required to achieve this optimal fraction? For exact recovery of systematic disks in an MDS code of low redundancy, i.e. k/n > 1/2, the best known explicit codes [1] have a sub-packetization factor I which is exponential in k. It has been conjectured [2] that for a fixed number of parity nodes, it is in fact necessary for ℓ to be exponential in k. In this paper, we provide new converse bounds on k for a given ℓ We prove that k ≤ ℓ2 for an arbitrary but fixed number of parity nodes r = n ™ k. For the practical case of 2 parity nodes, we prove a stronger result that k ≤ 4ℓ. © 2013 IEEE.