Storing matrices on disk: Theory and practice revisited
Conference Paper
We consider the problem of storing arrays on disk to support scalable data analysis involving linear algebra. We propose the Linearized Array B-tree, or LAB-tree, which supports flexible array layouts and automatically adapts to varying sparsity across parts of an array and over time. We reexamine the B-tree splitting strategy for handling insertions and the flushing policy for batching updates, and show that common practices may in fact be suboptimal. Through theoretical and empirical studies, we propose alternatives with good theoretical guarantees and/or practical performance. © 2011 VLDB Endowment.
Cited Authors
- Zhang, Y; Munagala, K; Yang, J
Published Date
- August 1, 2011
Published In
Volume / Issue
- 4 / 11
Start / End Page
- 1075 - 1086
Electronic International Standard Serial Number (EISSN)
- 2150-8097
Citation Source
- Scopus