Storing matrices on disk: Theory and practice revisited

Conference Paper

We consider the problem of storing arrays on disk to support scalable data analysis involving linear algebra. We propose the Linearized Array B-tree, or LAB-tree, which supports flexible array layouts and automatically adapts to varying sparsity across parts of an array and over time. We reexamine the B-tree splitting strategy for handling insertions and the flushing policy for batching updates, and show that common practices may in fact be suboptimal. Through theoretical and empirical studies, we propose alternatives with good theoretical guarantees and/or practical performance. © 2011 VLDB Endowment.
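As a rough illustration of the "linearized array" idea named in the abstract, the sketch below maps two-dimensional array indices to one-dimensional keys and keeps only the nonzero cells in key order. It is not the paper's implementation: the names `row_major_key`, `col_major_key`, and `LinearizedStore` are hypothetical, and a sorted in-memory list stands in for a disk-based B-tree.

```python
from bisect import insort


def row_major_key(i, j, n_cols):
    """Map cell (i, j) to a 1-D key so that each row is contiguous."""
    return i * n_cols + j


def col_major_key(i, j, n_rows):
    """Map cell (i, j) to a 1-D key so that each column is contiguous."""
    return j * n_rows + i


class LinearizedStore:
    """Toy ordered store keyed by linearized index (assumption: a sorted
    list replaces the B-tree; only explicitly written cells are stored)."""

    def __init__(self, linearize):
        self.linearize = linearize  # function (i, j) -> integer key
        self.keys = []              # keys kept in sorted order
        self.values = {}            # key -> stored value

    def put(self, i, j, value):
        key = self.linearize(i, j)
        if key not in self.values:
            insort(self.keys, key)  # keep keys sorted on first insert
        self.values[key] = value

    def get(self, i, j, default=0.0):
        # Cells never written are treated as zero (sparse representation).
        return self.values.get(self.linearize(i, j), default)

    def scan(self):
        """Return (key, value) pairs in linearized order."""
        return [(k, self.values[k]) for k in self.keys]


# Usage: a matrix with 4 columns, stored in row-major order.
store = LinearizedStore(lambda i, j: row_major_key(i, j, 4))
store.put(2, 1, 5.0)
store.put(0, 3, 1.0)
print(store.scan())
```

Swapping in `col_major_key` changes which cells end up adjacent in key order, which is the sense in which the choice of linearization determines the array layout in this sketch.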

Duke Authors

Cited Authors

  • Zhang, Y; Munagala, K; Yang, J

Published Date

  • August 1, 2011

Published In

Volume / Issue

  • 4 / 11

Start / End Page

  • 1075 - 1086

Electronic International Standard Serial Number (EISSN)

  • 2150-8097

Citation Source

  • Scopus