Suvarna Garge (Editor)

External sorting

Updated on
Edit
Like
Comment
Share on FacebookTweet on TwitterShare on LinkedInShare on Reddit

External sorting is a class of sorting algorithms that can handle massive amounts of data. External sorting is required when the data being sorted do not fit into the main memory of a computing device (usually RAM) and instead they must reside in the slower external memory (usually a hard drive). External sorting typically uses a hybrid sort-merge strategy. In the sorting phase, chunks of data small enough to fit in main memory are read, sorted, and written out to a temporary file. In the merge phase, the sorted subfiles are combined into a single larger file.

Other algorithms

External merge sort is not the only external sorting algorithm; there are also distribution sorts, which work by partitioning the unsorted values into smaller "buckets" that can be sorted in main memory. Like merge sort, external distribution sort also has a main-memory sibling; see bucket sort. There is a duality, or fundamental similarity, between merge- and distribution-based algorithms that can aid in thinking about sorting and other external memory algorithms. There are in-place algorithms for external sort, which require no more disk space than the original data.

References

External sorting Wikipedia