How to Sort in Rexx



by Howard Fosdick   © 2024 RexxInfo.org


Sorting

Introduction

How do you sort a list in Rexx? I ran into this challenge the other day. This article summarizes what I learned. It also provides free Rexx code for the ten most popular sorting algorithms (written by Christian Maas).


Background

There are a couple dozen widely known sorting algorithms. All share a common goal -- to take an unordered list (or table or array) of data items, and to sort them into a specified order. The comparison operator determines how items will be sorted. Popular sorts include alphabetic, numeric, and alphanumeric. Sorts might be either ascending or descending order.

Sorts can be in-place, requiring no extra space other than that of the original array, or they can take advantage of "work space" to improve speed.

Internal sorts are performed solely in memory, so they are speediest. The alternative is external sorting, which requires disk space. External sorts are necessary when internal memory is not sufficient for sorting a large data set.

Sometimes you'll also hear the term internal applied to a sort performed within a user program. In this case, external refers to a sort performed outside that program. In other words, an "external" sort means that the user program calls an operating system sort or some other sorting utility to perform sorting on its behalf.

Which sort is best? That really depends on your individual requirements. Sorts are often judged by their speed, but how much space they require could also be a consideration. And there may be other factors, too. For example, some sorting algorithms can capitalize on data that is semi-sorted to increase performance. Another example: some sorts are better adapted to specific data streams than others.

The bottom line is that if you really want to know which sort performs best, you have to test it with your own data on your own computer. You can shorten the process by testing the most promising algorithms first. This article should help in that decision.


Free Rexx Sorting Programs

While contemplating whether I'd have to write a Rexx sort program myself, I ran across an excellent code set written by Christian Maas. His Rexx programs implement ten of the most popular sorts. They're easy to use, clearly written, and ready to go. Mr Maas generously distributes them as freeware under this license.

These free Rexx sorting programs are generic. They run on almost any major platform under any major operating system. They can handle almost any sorting need you have. Use them "as is", or employ them as easy-to-understand models for writing your own code.

This chart summarizes Mr Maas' programs, the kinds of sorts they implement, and their relative speeds. Following the chart, I'll discuss the sorting algorithms and how they work.


Sort: Program Name: Speed: My Speed:Rank:Code:
          
Bubble Sort bubble_sort.rexx .025938 .011654 8 here
Comb Sort comb_sort.rexx .006709 .002317 4/3 here
Gnome Sort gnome_sort.rexx .022870 .010377 7 here
Heap Sort heap_sort.rexx .007101 .003311 5 here
Merge Sort merge_sort.rexx .005170 .002436 3/4 here
Quick Sort quick_sort.rexx .003052 .001572 1 here
Selection Sort selection_sort.rexx .011232 .005899 6 here
Shell Sort shell_sort.rexx .003809 .001940 2 here
External OS Sort external_OS_sort.rexx   n/a   n/a n/a here


Performance

The two columns in the table for Speed convey the elasped time based on Rexx TIME functions issued within the programs before and after the sort. (Using this Rexx function, instead of operating-system time calls, makes the programs portable and operating-system independent.) All times are in seconds.

Obviously, the performance you get will depend on the machine you run on. The numbers in the column labeled Speed: are directly taken from the internal comments Mr Maas provides for each program.

The column labeled My Speed: resulted from my own runs of the programs. I ran each five times while user activities were quiesced, and averaged the results together. All programs ran using the same data provided by Mr Maas. (The desktop computer had an Intel i7-3770 3.4 ghz processor and 8 gigabytes of memory. The operating system was Linux Mint 21.3 and the Rexx interpreter was Regina Rexx version 3.9.5.)

The column labeled Rank: shows the relative speed of the algorithms. Quick and Shell sort are fastest, while Gnome and Bubble sort are slowest. Comb and Merge sorts trade positions in Mr Maas' tests versus my own.


The Sorting Algorithms

Here's a brief description of each of the sorting algorithms. You can find a more detailed summary of each in this article. Or read their Wikipedia entries.

Bubble Sort: Bubble sort is generally considered one of the simplest but slowest sorting algorithms (as proved in the above chart). It's unsuitable for large data sets. The way it works is that it compares two items side-by-side and exchanges them if necessary. It traverses the list many times to achieve completion, which is why it is slow. It's called Bubble sort because items bubble up or down the list during passes of the data.

Comb Sort: Comb sort is an improved version of the Bubble sort. Whereas Bubble sort always compares adjacent items, Comb sort improves this by increasing the gap size to more than 1.

Gnome Sort: Like Bubble sort, Gnome sort is generally considered one of the slower sort algorithms. This sort inspects two adjacent data items. If they are in the correct order, advance to the next item. Otherwise, step one item backwards.

Heap Sort: This sort converts the data set into a heap data structure. It then manipulates the heap to sort the items. The algorithm swaps the root of the heap (the tree) with the last item, removes the last item, and reheapifies remaining items.

Merge Sort: Merge sort is a "divide and conquer" algorithm. Merge sort recursively divides the input into smaller lists, sorts those subarrays, and then merges them back together.

Quick Sort: Quick sort is typically the fastest sorting algorithm of them all. Like Merge sort, it employs a divide and conquer approach. Quick sort picks an element as a pivot and partitions the given array around the pivot by placing the pivot in its correct position in the sorted array.

Selection Sort: This sort can be useful when memory is limited. It divides the list into unsorted and sorted portions. Then it repeatedly selects the largest (or smallest) item from the unsorted portion of the list and moves it into position in the sorted portion of the list.

Shell Sort: Shell sort compares far-apart items in the list and exchanges them if needed. It then progressively reduces the gap between items to compare. In our performance tests it came in second behind Quick sort.

External OS Sort: In this sort, the Rexx program invokes the operating system's SORT command to sort the data. Obviously this performs differently depending on the operating system and how its SORT command operates. With our small test data set, if invoking the SORT command means externalizing the data to disk, this sort will always perform the slowest, since disk I/O is always slower than in-memory activity. On the other hand, if the Rexx program can ask the OS to sort in memory, for example, through piping or array passing, then this sort might be competitive with the others listed here.


Conclusion

This brief overview of Rexx sorting gives you a rough idea as to which sorting algorithms are likely to be speediest and why. You can use the freeware programs "as is", or as models for your own sorting programs.

----------------------------
Special thanks to Christian Maas for use of his freeware code. Picture at top courtesy of Prepbytes Blog.