Yesterday i was translating some code i wrote in R to C++. I had some calculation to do which required the list of index of the top `n` values in a list. Therefore what i needed was not a the list sorted itself or get a sorted copy of a given list, but actually the sorted order of the index into the actual array. To describe the problem, for example consider the following list:

arr = 20 50 40 80 10

The sorted (non-decreasing) ordering of the values is obviously

sorted_arr = 10 20 40 50 80

The sorted (non-decreasing) ordering of the index into the original array is (index starts at 1 in this example)

sorted_index_into_arr = 5 1 3 2 4

Therefore the smallest value in the list `arr` could be found by indexing into `arr` using the first value of `sorted_index_into_arr`, which stores the index of the array `arr` holding the smallest value.

arr[ sorted_index_into_arr[1] ]

The sorted index ordering is very easy to get in languages like R and Octave or Matlab. For example in R we can do the following to get the sorted index order using the `order` function:

> arr <- c (20, 50, 40, 80, 10) > arr [1] 20 50 40 80 10 > order (arr) [1] 5 1 3 2 4

In the case of Octave or Matlab you can get the index order using the `sort` function, which will return two lists (1xn matrix), the first list is the sorted array itself, and the next one is the sorted index order into the original array.

octave:1> a = [20 50 40 80 10] a = 20 50 40 80 10 octave:2> [sorted index] = sort (a) sorted = 10 20 40 50 80 index = 5 1 3 2 4

Although these functional languages provides gives these features in C/C++ it is not immediately available using the builtin sort library functions. But the sorting routines accepts a function pointer to a comparison function, writing appropriate code for which will do the trick.

### The Idea

From this point all index will be 0 based. Say, the length of the list is `n`. Initially we will have an index list `idx = ( 0, 1, 2, ..., (n - 1) )` , and an original array with values `arr = (x _{0}, x_{1}, x_{2}, ..., x_{(n - 1)})`. The

`idx`reflects the permutation of the elements in

`arr`. If we were to sort the array itself, we would compare the values of the array

`arr`and then depending on the outcome we would rearrange the values of

`arr`. On the other hand if we need to get the sorted index ordering, we need to sort

`idx`depending on the values of

`arr`. That is, we need to compare the values of

`arr`and depending on the outcome of the result, instead of modifying

`arr`we perform the same modification in

`idx`. When comparing values of

`arr`we need to index it using the values of

`idx`. This is because at all time

`idx`will hold the modified array element permutation of

`arr`. Thus, although

`arr`is not modified itself, the modified

`arr`values can be found by indexing into

`arr`using

`idx`values, which stores the permutation of the index values of original

`arr`, reflecting a modified

`arr`indirectly.

Say after some passes of sorting we have some modified `idx` array, which represents the *partially sorted index order*. In that case for each `0 <= i <= (n - 1)`, `arr[idx[i]]` will give the *partially sorted values*. Basically the contents `idx[i]` tells which is the `i ^{th}` element in the partially modified

`arr`.

To demonstrate i am showing a simple selection sort code to get the sorted index order of a given list. The sort function is given below:

int *sorted_order (const int *arr, int n) { int *idx, i, j; idx = malloc (sizeof (int) * n); for (i=0; i<n; i++) { idx[i] = i; } for (i=0; i<n; i++) { for (j=i+1; j<n; j++) { if (arr[idx[i]] > arr[idx[j]]) { swap (&idx[i], &idx[j]); } } } return idx; }

Note how the indexing is done in the above highlighted lines (16 and 18). In line 16 we compare the actual values of the partially modified array values accessed by `arr[idx[i]]` and then if a wrong ordering is found we swap the index `idx[i]` and `idx[j]` to reflect the change in `arr` and keep `arr` untouched. At any moment the ordering of the values of `arr` is held by the values in array `idx`. Say for the input of `arr = (20, 50, 40, 80, 10)`, we have initially `idx = (0, 1, 2, 3, 4)`. The passes of selection sort are shown below.

arr = (20, 50, 40, 80, 10) Stays unmodified throughout

#### Pass #1

Pass #1 initial idx = (0, 1, 2, 3, 4) Iteration 1 compares: (arr[idx[0]] > arr[idx[1]]) --> (arr[0] > arr[1] ) --> (20 > 50) --> false no interchange Iteration 2 compcares: (arr[idx[0]] > arr[idx[2]]) --> (arr[0] > arr[2] ) --> (20 > 40) --> false no interchange Iteration 2 compares: (arr[idx[0]] > arr[idx[3]]) --> (arr[0] > arr[3] ) --> (20 > 80) --> false no interchange Iteration 2 compares: (arr[idx[0]] > arr[idx[4]]) --> (arr[0] > arr[4] ) --> (20 > 10) --> true swap (idx[0] with idx[4]) updated idx = (4, 1, 2, 3, 0) Therefore new array ordering is arr[idx[0 to 4]] = arr[4, 1, 2, 3, 0] = 10, 50, 40, 80, 20 which is what we wanted.

#### Pass #2

Pass #2 updated idx = (4, 1, 2, 3, 0) Iteration 1 compares: (arr[idx[1]] > arr[idx[2]]) --> (arr[1] > arr[2] ) --> (50 > 40) --> true swap (idx[1] with idx[2]) updated idx = (4, 2, 1, 3, 0) Iteration 2 compares: (arr[idx[1]] > arr[idx[3]]) --> (arr[2] > arr[3]) --> (40 > 80) --> false no interchange Iteration3 compares: (arr[idx[1]] > arr[idx[4]]) --> (arr[2] > arr[0]) --> (40 > 20) --> true swap (idx[1] with idx[4]) updated idx = (4, 0, 1, 3, 2) Therefore new array ordering is arr[idx[0 to 4]] = arr[4, 0, 1, 3, 2] = 10, 20, 50, 80, 40 so we are in the right track.

#### Pass #3

Pass #3 updated idx = (4, 0, 1, 3, 2) Iteration 1 compares: (arr[idx[2]] > arr[idx[3]]) --> (arr[1] > arr[3] ) --> (50 > 80) --> false no interchange Iteration 1 compares: (arr[idx[2]] > arr[idx[4]]) --> (arr[1] > arr[2] ) --> (50 > 40) --> true swap (idx[2] with idx[4]) updated idx = (4, 0, 2, 3, 1) Therefore new array ordering is arr[idx[0 to 4]] = arr[4, 0, 2, 3, 1] = 10, 20, 40, 80, 50 so we are in the right track.

#### Pass #4

Pass #4 updated idx = (4, 0, 2, 3, 1) Iteration 1 compares: (arr[idx[3]] > arr[idx[4]]) --> (arr[3] > arr[1] ) --> (80 > 50) --> true swap (idx[3] with idx[4]) updated idx = (4, 0, 2, 1, 3) Therefore new array ordering is arr[idx[0 to 4]] = arr[4, 0, 2, 1, 3] = 10, 20, 40, 50, 80 so we are in the right track.

#### Finally

Finally we get idx = (4, 0, 2, 1, 3) which is the sorted index order

### Using with sort library functions

We don’t need to write own sorting function in order to get the sorted order index. It will not be good to write your own sorting function, as the libraries already comes with efficient sorting functions, also using the library sort methods will make the code portable. In C we have the `bsort` from `stdlib` and n C++ we have the `std::sort` method from C++ STL. I will describe how to use these to achieve the sorted index ordering.

### Using qsort in C

`qsort` signature is as follows:

void qsort(void *base, size_t nmemb, size_t size, int(*compar)(const void *, const void *));

It will accept a `void` pointer to the base address of an array of elements of any type in the `base` argument, the number of elements in the array in `nmemb` and the size of each element in `size`. The last element is a function pointer which we need to pass to tell how to compare the two elements, in case you pass strings, structures etc. and also control the sort ordering, non-increasing or non-decreasing. Here is what the manual has to say about the return value of the compare function: *“The comparison function must return an integer less than, equal to, or greater than zero if the first argument is considered to be respectively less than, equal to, or greater than the second”*.

Given these information we need to pass the `idx` array as the first argument and the `nmemb` and `size` as required (`sizeof (type)`.) For our purposes the comparison function should compare the value at the original `arr` indexed by `idx` as we have seen in the above selection sort example. The `qsort` function will call the comparison function using the values in the passed array to it, which is actually pointer to the `idx`. Therefore a comparison function would be like below:

static int compar (const void *a, const void *b) { int aa = *((int *)a), bb = *((int *)b); if (arr[aa] < arr[bb]) return -1; if (arr[aa] == arr[bb]) return 0; if (arr[aa] > arr[bb]) return 1; }

Note that the ugly looking `*((int *)a)` is nothing but typecasting the `void *` pointer to integer first, then dereferencing it to get the value pointed to by the pointer.

Here we compare the values in `arr`, the original array, indexed with the passed integer values in `compar` function, which are actually the values of the `idx` array. There is one issue. Where will we get the `arr` inside the scope of this function? One way is we can make a global or static global pointer pointing to the original array and use it in the compare function.

Here goes the example code.

#include <stdio.h> #include <stdlib.h> /* holds the address of the array of which the sorted index * order needs to be found */ int *base_arr; /* Note how the compare function compares the values of the * array to be sorted. The passed value to this function * by `qsort' are actually the `idx' array elements. */ static int compar (const void *a, const void *b) { int aa = *((int *) a), bb = *((int *) b); if (base_arr[aa] < base_arr[bb]) return -1; if (base_arr[aa] == base_arr[bb]) return 0; if (base_arr[aa] > base_arr[bb]) return 1; } int main (void) { int *arr, *idx, n, i; printf ("\nEnter n: "); scanf ("%d", &n); arr = malloc (sizeof (int) * n); idx = malloc (sizeof (int) * n); printf ("\nEnter the list: "); for (i = 0; i < n; i++) { scanf ("%d", &arr[i]); } /* initialize initial index permutation of unmodified `arr' */ for (i = 0; i < n; i++) { idx[i] = i; } /* Assign the address of out original array to the static global * pointer, this will be used by the compare function to index * into the original array using `idx' values */ base_arr = arr; qsort (idx, n, sizeof (int), compar); printf ("\nOriginal list: "); for (i = 0; i < n; i++) { printf ("%d ", arr[i]); } printf ("\nSorted index: "); for (i = 0; i < n; i++) { printf ("%d ", idx[i]); } printf ("\nSorted array using arr[sorted_idx[i]]: "); for (i = 0; i < n; i++) { printf ("%d ", arr[idx[i]]); } free (arr); free (idx); printf ("\n"); return 0; }

Note how the address of the original array `arr` is assigned to the static global pointer `base_arr` just before the call to `bsort` . The `compare` function will access this `base_arr` pointer to access the original values in `arr` using the values of the `idx` (`a` and `b` in `compar` are actually pointers to elements of `idx`

### Using the C++ STL std::sort method

#### Passing predicate function

In C++ we always do not have the liberty to declare globals. If you are using to `sort` function without any classes around then the way is similar to what we have seen with C. Here we go with an example in this using a function.

#include <iostream> #include <vector> #include <algorithm> using namespace std; static vector < int >*base_arr; bool compar (int a, int b) { return ((*base_arr)[a] < (*base_arr)[b]); } int main (void) { vector < int >arr, idx; int n; cout << "\nEnter n: "; cin >> n; arr.resize (n); idx.resize (n); cout << "\nEnter the list: "; for (int i = 0; i < n; i++) { cin >> arr[i]; } /* initialize initial index permutation of unmodified `arr' */ for (int i = 0; i < n; i++) { idx[i] = i; } base_arr = &arr; sort (idx.begin (), idx.end (), compar); cout << "\nOriginal list: "; for (int i = 0; i < n; i++) { cout << (*base_arr)[i] << " "; } cout << "\nSorted index: "; for (int i = 0; i < n; i++) { cout << idx[i] << " "; } cout << "\nSorted array using arr[sorted_idx[i]]: "; for (int i = 0; i < n; i++) { cout << (*base_arr)[idx[i]] << " "; } cout << endl; return 0; }

#### Passing overloaded () operator class object

There is another way to achieve the same results, which is through functor or function object. Which is a class with the `()` operator overloaded. So we can apply `()` operator on the class object to invoke the overloaded `()` function. In out case we overload the `()` operator being the comparison function, and we pass to sort an instance of this class. In the `sort` function whenever the it will attempt to call the function, actually `()` operator will be applied on the object by the `sort` function it will call the comparator actually. This is the same stuff which we wanted. Here is an example class about which I was talking about.

struct compare_index { const vector<int> base_arr; compare_index (const vector<int> &arr) : base_arr (arr) {} bool operator () (int a, int b) const { return (base_arr[a] < base_arr[b]); } };

In this case we call the `sort` function as below:

sort (idx.begin (), idx.end (), compare_index (arr));

By `compare_index (arr)` this will initialize the value of the original vector into the member variable `base_arr` of `compare_index`, using which we will be able to compare the array values. Thus, this eliminates the global variables as in the previous example.

You can also declare the class to accept any type, for which you can declare it to be as below

templace<class T>struct compare_index { const T base_arr; compare_index (const T arr) : base_arr (arr) {} bool operator () (int a, int b) const { return (base_arr[a] < base_arr[b]); } };

To use it with our example of `vector` use

sort (idx.begin (), idx.end (), compare_index<vector<int> &>(arr));

A sample code is given below, which also achieves the same result.

#include <iostream> #include <vector> #include <algorithm> using namespace std; struct compare_index { const vector < int >base_arr; compare_index (vector < int >&arr):base_arr (arr) { } bool operator () (int a, int b) const { return (base_arr[a] < base_arr[b]); } }; class my_test { public: void test_sorting (void) { vector < int >arr, idx; int n; cout << "\nEnter n: "; cin >> n; arr.resize (n); idx.resize (n); cout << "\nEnter the list: "; for (int i = 0; i < n; i++) { cin >> arr[i]; } /* initialize initial index permutation of unmodified `arr' */ for (int i = 0; i < n; i++) { idx[i] = i; } sort (idx.begin (), idx.end (), compare_index (arr)); cout << "\nOriginal list: "; for (int i = 0; i < n; i++) { cout << arr[i] << " "; } cout << "\nSorted index: "; for (int i = 0; i < n; i++) { cout << idx[i] << " "; } cout << "\nSorted array using arr[sorted_idx[i]]: "; for (int i = 0; i < n; i++) { cout << arr[idx[i]] << " "; } cout << endl; } }; int main (void) { my_test tstobj; tstobj.test_sorting (); return 0; }