Harman Patil (Editor)

Sort (Unix)


In Unix-like operating systems, sort is a standard command-line program that prints the lines of its input, or the concatenation of all files listed in its argument list, in sorted order. Sorting is done based on one or more sort keys extracted from each line of input. By default, the entire line is taken as the sort key. Blank space is the default field separator.

The "-r" flag will reverse the sort order.

History

Sort was part of Version 1 Unix. By Version 4 Ken Thompson had modified it to use pipes, but sort retained an option to name the output file because it was used to sort a file in place. In Version 5, Thompson invented "-" to represent standard input.

Sort a file in alphabetical order

$ cat phonebook
Smith, Brett 555-4321
Doe, John 555-1234
Doe, Jane 555-3214
Avery, Cory 555-4132
Fogarty, Suzie 555-2314
$ sort phonebook
Avery, Cory 555-4132
Doe, Jane 555-3214
Doe, John 555-1234
Fogarty, Suzie 555-2314
Smith, Brett 555-4321

Sort by number

The -n option makes the program sort according to numerical value. The du command produces output that starts with a number, the file size, so its output can be piped to sort -n to produce a list of files sorted by ascending file size.
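As a minimal sketch of the difference -n makes, the commands below sort the same three numbers as text and then by value, and then apply the same idea to du. The scratch directory and the file names "big" and "tiny" are made up for illustration, and the sizes du reports will vary with the filesystem.

```shell
# Without -n, lines are compared as text, so "100" sorts before "9".
lexical=$(printf '10\n9\n100\n' | sort)
# With -n, the leading number on each line is compared by value.
numeric=$(printf '10\n9\n100\n' | sort -n)
echo "text order:    $(echo $lexical)"    # text order:    10 100 9
echo "numeric order: $(echo $numeric)"    # numeric order: 9 10 100

# The same idea applied to du, using a scratch directory with two files
# of very different sizes (the names "big" and "tiny" are made up).
dir=$(mktemp -d)
head -c 2000000 /dev/zero > "$dir/big"
head -c 10 /dev/zero > "$dir/tiny"
du "$dir"/* | sort -n    # reported sizes depend on the filesystem
rm -r "$dir"
```

Appending "| tail -n 5" to such a pipeline is a common way to see only the largest entries.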

Columns or fields

Use the -k option to sort on a certain column. For example, use "-k 2" to sort on the second column. In old versions of sort, the +1 option made the program sort on the second column of data (+2 for the third, etc.). This usage is deprecated.

$ cat zipcode
Adam 12345
Bob 34567
Joe 56789
Sam 45678
Wendy 23456
$ sort -k 2n zipcode
Adam 12345
Wendy 23456
Bob 34567
Sam 45678
Joe 56789

Sort on multiple fields

The -k m,n option lets you sort on a key that is potentially composed of multiple fields (start at column m, end at column n):

$ cat quota
fred 2000
bob 1000
an 1000
chad 1000
don 1500
eric 5000
$ sort -k2,2n -k1,1 quota
an 1000
bob 1000
chad 1000
don 1500
fred 2000
eric 5000

Here the first sort is done using column 2. -k2,2n specifies sorting on the key starting and ending with column 2, and the n suffix requests numeric ordering. If -k2 were used instead, the sort key would begin at column 2 and extend to the end of the line, spanning all the fields in between. -k1,1 breaks ties using the value in column 1, sorting alphabetically by default. Note that an, bob and chad have the same quota and are sorted alphabetically in the final output.
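The effect of leaving off the end position is easier to see with a third column. The following sketch uses made-up three-column data (not one of the files above) in which both lines have the same value in column 2, so the two key definitions produce different orders.

```shell
# Hypothetical three-column data: both lines share the value 1 in column 2.
data=$(mktemp)
printf 'x 1 b\ny 1 a\n' > "$data"

# Key is column 2 only: the keys tie, so -k1,1 decides and x comes first.
bounded=$(sort -k2,2 -k1,1 "$data")
# Key runs from column 2 to the end of the line: "1 a" sorts before "1 b".
unbounded=$(sort -k2 -k1,1 "$data")

printf '%s\n' "-k2,2 -k1,1:" "$bounded" "-k2 -k1,1:" "$unbounded"
rm "$data"
```

With the unbounded key the tie-breaker on column 1 never comes into play, because the third column already distinguishes the two lines.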

Sorting a pipe delimited file

$ sort -t'|' -k2 zipcode
Adam|12345
Wendy|23456
Bob|34567
Sam|45678
Joe|56789

Sorting a tab delimited file

Sorting a file with tab-separated values requires a tab character to be specified as the column delimiter. With the shell's dollar-quote notation, the tab can be written as a C escape sequence: $'\t'.
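A minimal sketch of a tab-delimited sort follows, assuming a shell that supports dollar-quoting (bash, ksh and zsh do). The phonebook contents and the temporary file are hypothetical, modeled on the earlier phonebook example.

```shell
# Hypothetical tab-separated phonebook: name, a tab, then the number.
tsv=$(mktemp)
printf 'Smith, Brett\t555-4321\nDoe, John\t555-1234\nDoe, Jane\t555-3214\n' > "$tsv"

# $'\t' is expanded by the shell into a literal tab before sort runs.
# The names contain spaces, so only the tab may act as the separator.
by_number=$(sort -t $'\t' -k2,2 "$tsv")
printf '%s\n' "$by_number"
rm "$tsv"
```

In a shell without dollar-quoting, storing the tab in a variable first, for example tab=$(printf '\t'), and passing -t "$tab" should have the same effect.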

Sort in reverse

The -r option just reverses the order of the sort:

$ sort -rk 2n zipcode
Joe 56789
Sam 45678
Bob 34567
Wendy 23456
Adam 12345

Sort in random order

The GNU implementation has a -R/--random-sort option based on hashing; this is not a full random shuffle because it will sort identical lines together. A true random shuffle is provided by the GNU utility shuf.
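The difference can be observed with a small input that contains duplicates. This sketch assumes GNU sort and shuf are installed, and the input lines are made up. The actual order changes from run to run, so no particular output is shown for the two listings.

```shell
# Hypothetical input in which "a" and "b" each appear twice.
input=$(printf 'b\na\nb\nc\na\n')

# sort -R (GNU) orders lines by a hash of the key, so equal lines stay adjacent.
hashed=$(printf '%s\n' "$input" | sort -R)
# shuf (GNU coreutils) produces an arbitrary permutation of the lines.
shuffled=$(printf '%s\n' "$input" | shuf)

printf 'sort -R: %s\n' "$(echo $hashed)"
printf 'shuf:    %s\n' "$(echo $shuffled)"

# Because duplicates are adjacent after sort -R, uniq collapses them to 3 lines.
printf '%s\n' "$hashed" | uniq | wc -l
```

In the shuf output the two copies of "a" or "b" may or may not end up next to each other, which is what distinguishes a shuffle from the hash-based ordering.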

Sorting algorithm

The implementation in GNU Core Utilities, used on Linux, employs the merge sort algorithm.
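The internals of the GNU implementation are not visible from the command line, but the merge step that merge sort relies on can be tried directly with the standard -m option, which merges inputs that are already sorted. The following is only an illustrative sketch with two made-up input files, not a description of how GNU sort organizes its work.

```shell
# Two hypothetical inputs, each already in ascending numeric order.
a=$(mktemp); b=$(mktemp)
printf '1\n4\n7\n' > "$a"
printf '2\n3\n9\n' > "$b"

# -m merges the pre-sorted inputs in one pass instead of sorting from scratch.
merged=$(sort -n -m "$a" "$b")
echo $merged    # 1 2 3 4 7 9
rm "$a" "$b"
```

If the inputs are not actually sorted, -m does not repair them, so it is only appropriate when the files are known to be in order.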


