# How the usort() sorting algorithm works?

Viewed   94 times

I have an usort() example and I added some echo statements to see how the code works:

``````<?php
function list_cmp(\$a, \$b) {
global \$order;
echo "\$a=\$a, \$b=\$b </br>";

foreach (\$order as \$key => \$value) {
echo "\$value=\$value </br>";
if (\$a == \$value) {
echo "\$a=\$value, returing 0. </br>";
return 0;
}
if (\$b == \$value) {
echo "\$b=\$value, returing 1. </br>";
return 1;
}
}
}

\$order = 1;
\$order = 3;
\$order = 4;
\$order = 2;

\$array = 2;
\$array = 1;
\$array = 3;
\$array = 4;
\$array = 2;
array = 1;
\$array = 2;

usort(\$array, "list_cmp");
?>
``````

The output of the code is this:

``````\$a=2, \$b=1
\$value=1
\$b=\$value, returing 1.
\$a=2, \$b=3
\$value=1
\$value=3
\$b=\$value, returing 1.
\$a=1, \$b=3
\$value=1
\$a=\$value, returing 0.
\$a=2, \$b=4
\$value=1
\$value=3
\$value=4
\$b=\$value, returing 1.
\$a=3, \$b=4
\$value=1
\$value=3
\$a=\$value, returing 0.
\$a=2, \$b=2
\$value=1
\$value=3
\$value=4
\$value=2
\$a=\$value, returing 0.
\$a=2, \$b=1
\$value=1
\$b=\$value, returing 1.
\$a=2, \$b=1
\$value=1
\$b=\$value, returing 1.
\$a=4, \$b=1
\$value=1
\$b=\$value, returing 1.
\$a=3, \$b=1
\$value=1
\$b=\$value, returing 1.
\$a=1, \$b=1
\$value=1
\$a=\$value, returing 0.
\$a=2, \$b=2
\$value=1
\$value=3
\$value=4
\$value=2
\$a=\$value, returing 0.
``````

What is the mechanism of creating the 12 \$a-\$b pairs - 2-1, 2-3, 1-3, 2-4, 3-4, 2-2, 2-1, 2-1 (the same again?), 4-1, 3-1, 1-1, 2-2. The above code returns 1,1,0,1,0,0,1,1,1,1,0,0. And also what is the mechanism of sorting the array, based on the values that get returned? I am trying to understand how the usort() mechanism works.

Thanks.

2
1. How does the comparator work?

I'm not sure wether this was part of the question, but to be clear about how the comparator works: You have an order specified by ordered list `\$order` and a special comparator callback `list_cmp` which (should) return wether argument

• `\$a` is smaller than `\$b` (`return -1` or a value `< 0`)
• `\$a` greater than `\$b` (`return 1` or a value `> 0`)
• `\$a` equal `\$b` (`return 0`)

`list_cmp` does this by looking up its order table and rather checking if

• `\$a` has a smaller (or equal) order in which case the loop exits early with `return 0` or if
• `\$b` has a smaller order in which case the loop exits early with `return 1`.

Be aware that this is wrong according to the PHP Documentation which states it wants positive/negative/0 as return values. This is only correct if you know that the internals only check for `comparator(\$a,\$b) > 0`, meaning it only checks if `\$b` is smaller than and not equal to `\$a`, making this a comparison `order of \$a <= order of \$b`. It can easily break if the code starts to check for `\$a` being smaller than and not equal to `\$b`.

1. How does the quicksort sorting work?

For starters I'm assuming you are using PHP 7 or higher. In that case you hit a special case with 6-15 Element sized arrays. PHP 7+ does not seem to use quicksort for short lists, instead it uses an Insertion Sort Variant (which is most probably faster due to Hardware related stuff like caching/code locality). You can check the sorting source code f.e. on the Github PHP Mirror (as an example: PHP 7.0 Zend/Zend_sort.c Line 177-198).

The code works in 3 steps:

1. comparision: compare neighbor elements `array[j]` and `array[j+1]`, if `array[j] <= array[j+1]` move on, else goto 2.
2. find insertion point: now if `array[j] > array[j+1]`, scan backwards to find the point where `array[x] < array[j+1] <= array[x+1]` for `x < j` (obviously only until `x` hits the start)
3. insert: shift elements `x+1 ... j` one up such that it becomes `x+2 ... j+1` and insert former element at position `x+1`

If you apply that code to the pairings (2-1, 2-3, 1-3, 2-4, 3-4, 2-2, 2-1, 2-1, 4-1, 3-1, 1-1, 2-2) it gets obvious what the code does.

``````-- [2,1],3,4,2,1,2 -> 1./2./3. compare [2,1], find and insert 1 before 2
-- 1,[2,3],4,2,1,2 -> 1./2. compare [2,3], find insert point for 3 (since order of 3 < order of 2)
-- [1,3],2,4,2,1,2 -> 3. compare [1,3], found insert point for 3 before 2
-- 1,3,[2,4],2,1,2 -> 1./2. compare [2,4], find insert point for 4 (since order of 4 < order of 2)
-- 1,[3,4],2,2,1,2 -> 3. compare [3,4], found insert point for 4 before 2
-- 1,3,4,[2,2],1,2 -> 1. compare [2,2], skip
-- 1,3,4,2,[2,1],2 -> 1./2. compare [2,1], find insert point for 1
-- 1,3,4,[2,1],2,2 -> 2. compare [2,1], find insert point for 1
-- 1,3,[4,1],2,2,2 -> 2. compare [4,1], find insert point for 1
-- 1,[3,1],4,2,2,2 -> 2. compare [3,1], find insert point for 1
-- [1,1],3,4,2,2,2 -> 3. compare [1,1], fond insert point for 1 before 3
-- 1,1,3,4,2,[2,2] -> 1. compare [2,2], skip
-- sorted: 1,1,3,4,2,2,2
``````

PS: Here you already see that it is quite complex to derive the work of even a simple sorting algorithm (22 lines of code) by its comparison patterns. The PHP 7 quicksort implementation is around 10 times as big in lines of code and has some freaky optimizations (besides normal madness due to pivot selection and recursions).

Most of the times it is better to ignore in depth implementation details and only reduce it to the stuff that is needed. Typical questions for a sorting algorithm would be if it is stable/unstable and performing in `O(log n)` with `O(n)` Memory consumption. There are easier ways to learn the core algorithms behind those optimized implementations like the Quicksort Dance or any other visualization or a good old (e)book or webpage with examples.

-- Edited

Added a (bad, unoptimized, unsafe) php implementation for insertion sort for another visualization of how it works:

``````<?php

function my_usort(\$A, \$comparator) {
// Start .. End Positions
\$current_pos = 0;
\$last_pos = count(\$A)-1;
// Outer Loop: each step checks that A up to A[current_pos] is sorted.
// When the algorithm finishes we know that A ... A[last_pos] is sorted
while(\$current_pos < \$last_pos) {
echo "Sorted Subarray from \$A is " . json_encode(array_slice(\$A, 0, \$current_pos+1)) . "<br>n";
echo "\$A looks like this now: " . json_encode(\$A) .
", comparing [" . \$A[\$current_pos] . "," . \$A[\$current_pos +1] . "] (verify step)<br>n";
// "Verification Step"
// At this point A ... A[current_pos] is sorted.
// Check A[current_pos] <= A[current_pos +1]
if(\$comparator(\$A[\$current_pos], \$A[\$current_pos +1]) > 0) {
// nope, A[current_pos] > A[current_pos +1] (list_cmp/comparator returns value > 0)
// "Insertion Step" start, find the correct position for A[current_pos+1] in the already
// sorted list A ... A[current_pos]
\$insert_point = \$current_pos;
// Swap the missmatching Neighbor pair
echo "swapping: \$A[" . \$insert_point . "] and \$A[" . (\$insert_point+1) . "]<br>n";
\$tmp = \$A[\$insert_point +1];
\$A[\$insert_point +1] = \$A[\$insert_point];
\$A[\$insert_point] = \$tmp;
\$sorted_up_to_current_pos = false;
// Inner Loop: find correct insertion point
while(\$insert_point > 0 && !\$sorted_up_to_current_pos) {
echo "\$A looks like this now: " . json_encode(\$A) .
", comparing [" . \$A[\$insert_point-1] . "," . \$A[\$insert_point] . "] (insertion step)<br>n";
// "Insertion Step", Swap the missmatching Neighbor pairs until A ... A[current_pos] is sorted again
if(\$comparator(\$A[\$insert_point-1], \$A[\$insert_point]) > 0) {
// Swap the missmatching Neighbor pair
echo "swapping: \$A[" . (\$insert_point-1) . "] and \$A[" . \$insert_point . "]<br>n";
\$tmp = \$A[\$insert_point];
\$A[\$insert_point] = \$A[\$insert_point-1];
\$A[\$insert_point-1] = \$tmp;
// goto new pair
\$insert_point = \$insert_point -1;
} else {
// found correct spot, done
\$sorted_up_to_current_pos = true;
}
}
\$A[\$insert_point] = \$tmp;
echo "\$A looks like this now: " . json_encode(\$A) . ", insertion done<br>n";
}
\$current_pos = \$current_pos + 1;
}
echo "Sorted Array \$A is " . json_encode(array_slice(\$A, 0, \$current_pos+1)) . "<br>n";
}

function list_cmp(\$a, \$b) {
global \$order;
//echo "\$a=\$a, \$b=\$b </br>n";

foreach (\$order as \$key => \$value) {
//echo "\$value=\$value </br>n";
if (\$a == \$value) {
echo "\$a=\$value, returing 0. </br>n";
return 0;
}
if (\$b == \$value) {
echo "\$b=\$value, returing 1. </br>n";
return 1;
}
}
}

\$order = 1;
\$order = 3;
\$order = 4;
\$order = 2;

\$array = 2;
\$array = 1;
\$array = 3;
\$array = 4;
\$array = 2;
\$array = 1;
\$array = 2;

my_usort(\$array, "list_cmp");
``````

Output is complete now with currently sorted array, positions:

``````Sorted Subarray from \$A is 
\$A looks like this now: [2,1,3,4,2,1,2], comparing [2,1] (verify step)
\$b=\$value, returing 1.
swapping: \$A and \$A
\$A looks like this now: [1,2,3,4,2,1,2], insertion done
Sorted Subarray from \$A is [1,2]
\$A looks like this now: [1,2,3,4,2,1,2], comparing [2,3] (verify step)
\$b=\$value, returing 1.
swapping: \$A and \$A
\$A looks like this now: [1,3,2,4,2,1,2], comparing [1,3] (insertion step)
\$a=\$value, returing 0.
\$A looks like this now: [1,3,2,4,2,1,2], insertion done
Sorted Subarray from \$A is [1,3,2]
\$A looks like this now: [1,3,2,4,2,1,2], comparing [2,4] (verify step)
\$b=\$value, returing 1.
swapping: \$A and \$A
\$A looks like this now: [1,3,4,2,2,1,2], comparing [3,4] (insertion step)
\$a=\$value, returing 0.
\$A looks like this now: [1,3,4,2,2,1,2], insertion done
Sorted Subarray from \$A is [1,3,4,2]
\$A looks like this now: [1,3,4,2,2,1,2], comparing [2,2] (verify step)
\$a=\$value, returing 0.
Sorted Subarray from \$A is [1,3,4,2,2]
\$A looks like this now: [1,3,4,2,2,1,2], comparing [2,1] (verify step)
\$b=\$value, returing 1.
swapping: \$A and \$A
\$A looks like this now: [1,3,4,2,1,2,2], comparing [2,1] (insertion step)
\$b=\$value, returing 1.
swapping: \$A and \$A
\$A looks like this now: [1,3,4,1,2,2,2], comparing [4,1] (insertion step)
\$b=\$value, returing 1.
swapping: \$A and \$A
\$A looks like this now: [1,3,1,4,2,2,2], comparing [3,1] (insertion step)
\$b=\$value, returing 1.
swapping: \$A and \$A
\$A looks like this now: [1,1,3,4,2,2,2], comparing [1,1] (insertion step)
\$a=\$value, returing 0.
\$A looks like this now: [1,1,3,4,2,2,2], insertion done
Sorted Subarray from \$A is [1,1,3,4,2,2]
\$A looks like this now: [1,1,3,4,2,2,2], comparing [2,2] (verify step)
\$a=\$value, returing 0.
Sorted Array \$A is [1,1,3,4,2,2,2]
``````
Tuesday, October 18, 2022

4

The function `cmp` itself doesn't do the sorting. It just tells `usort` if a value is smaller, equal or greater than another value. E.g. if `\$a = 5` and `\$b = 9` it will return 1 to indicate that the value in `\$b` is greater than the one in `\$a`.

Sorting is done by `usort`.

Wednesday, October 26, 2022

5

To sort anything you need a means to compare two items and figure out if one comes before the other. This is what you supply to usort. This function will be passed two items from your input array, and returns the order they should be in.

Once you have a means to compare two elements, you can use sort-algorithm-of-your-choice.

If you are unfamiliar, you might like to look at how a simple naive algorithm like bubblesort would use a comparison function.

Behind the scenes, PHP is using a quicksort.

Friday, November 18, 2022

4

Aha, a case for the Schwartzian Transform.

It basically consists of three steps:

1. decorate; you turn every value into an array with the value as the first element and the key/index as the second
2. sort (as per normal)
3. undecorate; you reverse step 1

Here it is (I've tweaked it to your particular use case):

``````function decorate(&\$v, \$k)
{
\$v['authn_weight'] = array(\$v['authn_weight'], \$k);
}

function undecorate(&\$v, \$k)
{
\$v['authn_weight'] = \$v['authn_weight'];
}

array_walk(\$a, 'decorate');
usort(\$a, 'weightSortImplementation');
array_walk(\$a, 'undecorate');
``````

The trick is in the following assertion:

``````array(\$x, 0) < array(\$x, 1)
``````

This is what keeps the correct order of your array. And, no recursion required :)

Sunday, November 13, 2022

5

I guess after such a long time you must have got an answer. But I am still providing some example links to help someone else who hits this question.

NOTE: Before looking into this link you should have an idea about Heap data structure Take a look at Example of Two-Way Sorting and Example of multiway external sorting and you will get a complete idea of the implementation of a external sorting algorithm

Tuesday, November 29, 2022