Thoth in action: Memory management in modern data analytics
Allocation and usage of memory in modern data-processing platforms is based on an interplay of algorithms at multiple levels: (i) at the resource-management level across containers allocated by resource managers like Mesos and Yarn, (ii) at the container level among the OS and processes such as the Java Virtual Machine (JVM), (iii) at the framework level for caching, aggregation, data shuffes, and application data structures, and (iv) at the JVM level across various pools such as the Young and Old Generation as well as the heap versus off-heap. We use Thoth, a data-driven platform for multi-system cluster management, to build a deep understanding of different interplays in memory management options. Through multiple memory management apps built in Thoth, we demonstrate how Thoth can deal with multiple levels of memory management as well as multi-tenant nature of clusters.
Duke Scholars
Published In
DOI
EISSN
Publication Date
Volume
Issue
Start / End Page
Related Subject Headings
- 4605 Data management and data science
- 0807 Library and Information Studies
- 0806 Information Systems
- 0802 Computation Theory and Mathematics
Citation
Published In
DOI
EISSN
Publication Date
Volume
Issue
Start / End Page
Related Subject Headings
- 4605 Data management and data science
- 0807 Library and Information Studies
- 0806 Information Systems
- 0802 Computation Theory and Mathematics