Harman Patil (Editor)

Mainframe sort merge


The Sort/Merge utility is a mainframe program to sort records in a file into a specified order, merge pre-sorted files into a sorted file, or copy selected records. Internally, these utilities use one or more of the standard sorting algorithms, often with proprietary fine-tuned code.
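As a rough illustration of the first two functions, the sketch below sorts records on a key field and merges inputs that are already in key order. It assumes fixed-width text records whose key is identified by a 1-based start position and a length; the helper names (`key_at`, `sort_records`, `merge_sorted`) are hypothetical and do not correspond to any Sort/Merge product's interface.

```python
import heapq


def key_at(start, length):
    """Return a key function for a fixed-position field (start is 1-based)."""
    def key(record):
        return record[start - 1:start - 1 + length]
    return key


def sort_records(records, key):
    """Return the records ordered by the given key field."""
    return sorted(records, key=key)


def merge_sorted(inputs, key):
    """Merge several inputs that are each already sorted on the key field."""
    return list(heapq.merge(*inputs, key=key))


# Hypothetical 9-character records: a 4-digit id followed by a 5-character name.
records = ["0003CAROL", "0001ALICE", "0002BOB  "]
by_id = sort_records(records, key_at(1, 4))
merged = merge_sorted([["0001A", "0004D"], ["0002B", "0003C"]], key_at(1, 4))
```

Merging only compares the current front record of each input, which is why it requires inputs that are already sorted.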

Mainframes were originally supplied with limited main memory by today's standards, and the amount of data to be sorted was frequently very large. Because of this, unlike more recent sort programs, early Sort/Merge programs placed great emphasis on efficient techniques for sorting data on secondary storage, typically tape or disk. In 1968 the OS/360 Sort/Merge program provided five different "sequence distribution techniques" that could be used depending on the number and type of devices available.
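The general idea of sorting more data than fits in memory can be sketched as a textbook external merge sort: sort small chunks in memory, write each as a sorted "run" to secondary storage, then merge the runs. This is a generic illustration only; it is not one of the OS/360 sequence distribution techniques, and the function names and the `max_in_memory` parameter are assumptions made for the example. It also assumes the input lines contain no embedded newlines.

```python
import heapq
import os
import tempfile


def _write_run(chunk):
    """Sort one in-memory chunk and write it to a temporary file (a 'run')."""
    chunk.sort()
    with tempfile.NamedTemporaryFile(
        "w", delete=False, encoding="utf-8", suffix=".run"
    ) as f:
        for line in chunk:
            f.write(line + "\n")
        return f.name


def _read_run(handle):
    """Yield the lines of an open run file without their trailing newline."""
    for line in handle:
        yield line.rstrip("\n")


def external_sort(lines, max_in_memory=1000):
    """Yield lines in sorted order, buffering at most max_in_memory lines
    while the runs are being built."""
    run_paths = []
    try:
        # Phase 1: build sorted runs on secondary storage.
        chunk = []
        for line in lines:
            chunk.append(line)
            if len(chunk) >= max_in_memory:
                run_paths.append(_write_run(chunk))
                chunk = []
        if chunk:
            run_paths.append(_write_run(chunk))

        # Phase 2: k-way merge of the runs, reading each one sequentially.
        handles = [open(path, "r", encoding="utf-8") for path in run_paths]
        try:
            yield from heapq.merge(*(_read_run(h) for h in handles))
        finally:
            for h in handles:
                h.close()
    finally:
        for path in run_paths:
            os.remove(path)


result = list(external_sort(["d", "a", "c", "b", "e"], max_in_memory=2))
```

Both phases read and write sequentially, which suits devices such as tape that handle sequential access far better than random access.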

In 1990 IBM introduced a new merge algorithm called BLOCKSET in DFSORT, the successor to OS/360 Sort/Merge. Of historical note, the BLOCKSET algorithm was invented by an IBM Systems Engineer in 1963; it was later discovered in IBM's archives and implemented in 1990.

Sort/Merge is very frequently used. It is often the most commonly used application program in a mainframe shop, generally consuming about twenty percent of the shop's processing power.

Modern Sort/Merge programs can also copy files, select or omit certain records, summarize records, remove duplicates, reformat records, append new data, and produce reports. Indeed, most Sort/Merge applications use this wide range of additional processing capabilities rather than purely sorting or merging records, because the Sort/Merge product is a very fast way of performing the input and output for these functions. Quite a number of "user exits" are supported, and these may be load modules (i.e., members of a library) or object decks (i.e., the output of an assembler); the Sort/Merge application loads the exit (load modules) or links it (object decks; termed "dynamic link editing" in DFSORT), as specified and required. Working storage datasets (i.e., SORTWK01, ..., SORTWKnn) may be on disk or tape, although the BLOCKSET algorithm is restricted to disk working storage; more working storage datasets generally improve performance.
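A few of these additional functions (selecting records, summarizing, removing duplicates, and reformatting) can be sketched in a handful of lines. The record layout, a `(region, amount)` tuple, and all function names here are assumptions chosen for illustration; they do not mirror the control statements of DFSORT or any other product.

```python
from itertools import groupby


def select(records, predicate):
    """Keep only the records for which predicate is true."""
    return [r for r in records if predicate(r)]


def summarize(records, key, amount):
    """Sort on key, then collapse equal keys into (key, total amount)."""
    ordered = sorted(records, key=key)
    return [
        (k, sum(amount(r) for r in group))
        for k, group in groupby(ordered, key=key)
    ]


def remove_duplicates(records, key):
    """Sort on key and keep the first record seen for each key value."""
    ordered = sorted(records, key=key)  # stable, so input order breaks ties
    return [next(group) for _, group in groupby(ordered, key=key)]


def reformat(records, layout):
    """Rebuild every record with the given layout function."""
    return [layout(r) for r in records]


def region(r):
    return r[0]


def amount(r):
    return r[1]


records = [("east", 10), ("west", 5), ("east", 7), ("north", 0)]

nonzero = select(records, lambda r: amount(r) > 0)
totals = summarize(nonzero, region, amount)
unique = remove_duplicates(records, region)
lines = reformat(totals, lambda r: f"{region(r):<6}{amount(r):05d}")
```

Summarizing and duplicate removal both rely on equal keys being adjacent after the sort, which is one reason such functions fit naturally into a sort utility.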

Sort/Merge is important enough that multiple companies each sell their own Sort/Merge package for IBM mainframes and their z/OS, z/VM and z/VSE operating systems. The major Sort/Merge packages are:

  • DFSORT, sold by IBM.
  • SyncSort, sold by Syncsort, Inc.
  • CA-Sort, sold by CA Technologies.

(Some of these vendors also sell versions for other platforms, such as Unix, Linux, or Windows.)

Historically, the "alias" SORT has been used to refer to IBM's Sort/Merge, and third-party Sort/Merge programs (e.g., SYNCSORT, CASORT) have also adopted SORT as an alias for their products. DFSORT is usually referred to by its program name, ICEMAN (component ICE; the original OS/360 Sort/Merge program name was IERCO00, component IER, also with "alias" SORT).

Some basic DFSORT and SyncSort examples can be found on the blog http://mframes.blogspot.com
