NEXSORT: Sorting XML in external memory
XML plays an important role in delivering data over the Internet, and the need to store and manipulate XML in its native format has become increasingly relevant. This growing need necessitates work on developing native XML operators, especially for one as fundamental as sort. In this paper we present NEXSORT, an algorithm that leverages the hierarchical nature of XML to efficiently sort an XML document in external memory. In a fully sorted XML document, children of every non-leaf element are ordered according to a given sorting criterion. Among NEXSORT's uses is in combination with structural merge as the XML version of sort-merge join, which allows us to merge large XML documents using only a single pass once they are sorted. The hierarchical structure of an XML document limits the number of possible legal orderings among its elements, which means that sorting XML is fundamentally "easier" than sorting a flat file. We prove that the I/O lower bound for sorting XML in external memory is Θ(max{n, n log