
Function: Sort_Array
The Sort_Array function performs a multi-level sorting of an array or a group of arrays. You can choose sorting order, and direction for each array in the sort operation. Each of the arrays needs to have the same size, but you can mix arrays of different types. The sorting keeps the rows across all the arrays synchronized.
You can prefix each array with "<" for ascending, and ">" for descending. No prefix indicates that the array has no sort option, but will follow in the result.
Prototype:
sort_array (<array1, >arrays2, array3,....,arrayN);
This will sort the arrays, ascending for array1, descending for array2, and array 3 is not sorted, but just follows the other arrays along with the rest of the arrays up to arrayN.
Parameters:
Parameter | Type | Description |
---|---|---|
array1.....arrayN | Array, any type | Comma-separated list of arrays |
prefix < | Ascending sort | |
Prefix > | Descending sort | |
No Prefix | Values is not part of the sorting algorithm, but the array is synced with the other arrays. |
Return value: No return value
Example:
In this example, we pull some fields from a database table, load them into arrays, and sort the arrays. The content of the arrays is printed before and after the sort. We also do some performance metrics that are returned from the function.
The line below is the sorting of arrays in the example:
sort_array ( < Department, < salary, < last_name, first_name);
Here is the full example:
function array_sort_test ()
long uS1, uS2, uS3;
// declare the arrays
ARRAY STRING first_name;
ARRAY STRING last_name;
ARRAY FLOAT salary;
ARRAY STRING Department;
uS1 = uSec(); // measure time for start
// Perform SQL on the employee table, and load the result into arrays.
string res = executeSQL("SELECT first_name, last_name, salary, Department FROM employee
INTO :first_name, :last_name, :salary, :Department");
uS2 = uSec(); // measure the time after SQL
// Print a table with the content before sort.
print format ("\n%-20s | %-20s | %-15s | %-10s |", "First Name", "last_name", "salary", "Department");
print "\n" + "-"*76+"\n";
int i;
for (i =1, sizeof (first_name))
print format ("\n%-20s | %-20s | %15.0f | %-10s |", first_name[i], last_name[i], salary[i], Department[i]);
end for
// perform sorting on department, salary and last name. First name just follows.
set_locale ( "nb_NO.UTF-8"); // Norwegian alphabet is used for sorting.
sort_array ( < Department, < salary, < last_name, first_name);
uS3 = uSec(); // measure time after sort
// print the table after sorting.
print "\n\nAfter sorting (Department, salary, last_name)";
print format ("\n%-20s | %-20s | %-15s | %-10s |", "First Name", "last_name", "salary", "Department");
print "\n" + "-"*76+"\n";
for (i =1, sizeof (first_name))
print format ("\n%-20s | %-20s | %15.0f | %-10s |", first_name[i], last_name[i], salary[i], Department[i]);
end for
// return the metrics.
return format ( "Sql %ld, Sort %ld, Total %ld", uS2-uS1, uS3-uS2, uS3-uS1);
end
This gave the following output:
First Name | last_name | salary | Department |
----------------------------------------------------------------------------
Efficient | SQL Developer | 30000 | DEV |
Good | Teamplayer | 35000 | SALE |
Enormous | Slacker | 45000 | ADM |
Coffee | Drinker | 30000 | DEV |
Shiny | Outfit | 60000 | DEV |
Lazy | Officeworker | 20000 | ADM |
Good | Driver | 15000 | TRANSPORT |
Smart | Analyst | 32500 | ADM |
After sorting (Department, salary, last_name)
First Name | last_name | salary | Department |
----------------------------------------------------------------------------
Lazy | Officeworker | 20000 | ADM |
Smart | Analyst | 32500 | ADM |
Enormous | Slacker | 45000 | ADM |
Coffee | Drinker | 30000 | DEV |
Efficient | SQL Developer | 30000 | DEV |
Shiny | Outfit | 60000 | DEV |
Good | Teamplayer | 35000 | SALE |
Good | Driver | 15000 | TRANSPORT |
Then the return value from the function (number of microseconds for each operation) :
Sql 1802, Sort 170, Total 1972
.
Optimized sorting
We have optimized the sorting in two different ways to make this as quick as possible. The challenge is the UTF-8 characters in different countries and language settings. To make it sort correctly we have added a command to specify the language settings. The command set_locale
do this. A very central function is to compare strings to determine their order. We have a two-step method for this. First we compare for pure ascii non-utf8 characters. If the order can be determined by this, we dont need the step-two. Usually the compare find the difference in the strings very early, often after det very first characters. If we hit a utf8-multi-byte sequence before the order is determined, we resort to the much more comprehensive compare where we use the locale to determine their order. As this happens rarely this is a valid optimization.
As you can have many grouped arrays to sort together, we make an index that we sort based on the selected criteria and direction first. Then when we have this index, we reorder all the arrays according to the index in the end. In this way, we reduce the need for moving data more than necessary.
The above example, we sorted four arrays with 8 rows in each. The function used only 170 uS, that is pretty fast. We then added some records with utf8 characters in the main sorting criteria. The sort time increased to 1600 uSec. Bøhaugen, Bøsvingen, Borgeresen
- even that is pretty fast. Then we tried the same example on a table with Norwegian postal codes, names, Municipality nr, name, Category, containing 5 arrays of 5146 records in each, with mixed UTF-8 characters in many of the records. The sort time for this was 18 mS. That is also really fast.
SQL Sorting
SQL sorting by ORDER BY
is an easy and effective approach. However, often data comes from other sources, like an import. Also, sometimes some processing of the data is needed before you can sort, so therefore we have the sort_array
Notes:
- It is important to include all the arrays in the group to keep consistency. It is easy to forget new arrays when adding fields to the query.
- Boolean arrays will be sorted as 0 and 1's
- Array types can be STRING, INTEGER, LONG, FLOAT, BOOL, DATE, TIME, TIMESTAMP, JSON, or XML. The JSON and XML type arrays will not be sorted but will follow.
- If the arrays are of different sizes, this will trigger an exception.