Getting things sorted

For an assignment for Programming 3 at IGAD, I have to optimize a piece of code. The code transforms, sorts, and renders an amount of sprites. Optimization of the rendering was done in class, so the sorting is the biggest bottleneck at the moment. Which is what I will dedicate this post to. I’ve found some sorting algorithms that I think are very interesting. The ones I will talk about are QuickSort, Radix Sort, and Tree Sort. The algorithm used in the code at first is an unoptimized Bubble Sort, which is very slow.

The sorting algorithm will have to deal with a high amount of sprites to sort, based on their Z-value, which is a floating-point number. The amount of sprites ranges is at least above 5000 and the assignment is to make the code render as many sprites on screen as I can.

Tree Sort

The first decent algorithm I imagined without research was a tree structure, where you would put all data in a binary tree. This automatically sorts the tree, which you could flatten back to an array quite easily. Apparently, this exists and isn’t a terrible solution for sorting. The only problem you face is when you have an unbalanced tree. If you have the maximum value as a first value, anything that gets added will get added to the left child nodes. With a bit of bad luck, this could be the case for a lot of elements, making you traverse every element before adding a new one.

The ease of implementation is what would make this an option for the simulation that I have to optimize. I will not use this one, but I will give example pseudo-code of how I would implement it. It can probably optimized in really simple ways, but since I’m not using this algorithm, I see no need in working on that at the moment. Adding to the tree would be as follows.

[code language=”cpp”]struct TreeNode
{
/*
* I’m using float3-vectors, this could be a Sprite as well.
* The code just draws sprites at positions.
*/
float3 value;
TreeNode* leftNode;
TreeNode* rightNode;

TreeNode(float3 a_value)
: value(a_value), leftNode(nullptr), rightNode(nullptr){}

~TreeNode()
{
if(leftNode) delete leftNode;
if(rightNode) delete rightNode;
}

void Insert(float3& a_other)
{
if(a_other.z <= value.z)
{
if(leftNode == nullptr)
leftNode = new TreeNode(a_other);
else
leftNode->Insert(a_other);
}
else
{
if(rightNode == nullptr)
rightNode = new TreeNode(a_other);
else
rightNode->Insert(a_other);
}
}

//float3 pointer to array, int pointer
void Flatten(float3* a_array, int& a_index)
{
if(leftNode != nullptr)
leftNode->Flatten(a_array, a_index);

a_array[a_index++] = value;

if(rightNode != nullptr)
rightNode->Flatten(a_array, a_index);
}
};

//Setting the first position vector as RootNode
TreeNode* root = new TreeNode(ListOfPositions[0]);

//DOTS is a define which is the number of sprites drawn
for(int i = 1; i < DOTS; i++)
{
root->Insert(ListOfPositions[i]);
}

//Now, to flatten the tree back to an array.
int index = 0;
root->Flatten(ListOfPositions, index);

//Don’t forget to clean up the tree!
delete root;[/code]
I tried this code in the application and quite quickly discovered that this is a slow method of sorting, compared to Quicksort. This might be because it is a completely unbalanced tree. The results vary, but it sometimes uses 100 times the amount of cycles that Quicksort uses.

The first thing I did when I started this assignment was to look for sorting algorithms. There is an awesome visualization of 15 algorithms on YouTube. This video features the Radix Sort (LSD) and Radix Sort (MSD). I will focus on LSD (Least Significant Digit) here, because that is what I tried to implement.

The idea of Radix Sort is to sort numbers in several passes. Every pass sorts the list by a digit in the number. If you have an array with {105, 401, 828, 976}, you can sort on the last digit. {105, 401, 828, 976} would become {401, 105, 976, 828}. After this, you sort on the second digit, making {401, 105, 976, 828} into {401, 105, 828, 976}. The last digit makes {401, 105, 828, 976} into {105, 401, 828, 976}, and we’re done sorting. The beauty of this algorithm is that you can do it without comparing numbers to eachother. I will explain how in a while.

Another cool property of this algorithm is that it is a stable sorting algorithm. This means that if you have two objects in an array with the same comparing value (e.g. {Vector3(100, 20, 3), Vector3(526, -10, 3)} when comparing Z-values), they are guaranteed to appear in the same order in the array. The second element will never be inserted before the first element. This is quite useful, because two objects with the same Z-value and similar X and Y positions might end up Z-fighting, a problem I am facing with Quicksort at the moment.

For this algorithm, you don’t compare values to eachother. The idea is to use buckets to put the objects in. For the previous example, you can have 10 buckets. A bucket per digit (0..9). When you are done filling the buckets, you loop through the buckets, adding all the items in the bucket to an array, making it all sorted on the current radix. For the previous example, this required three passes, one pass per digit. This makes the complexity of Radix Sort O(n*k) where k is equal to the amount of passes. This is very fast, but not applicable to all sorting problems. The good thing is, I can use this to sort floating-point numbers.

Floating-point numbers are difficult to sort this way, since the radices aren’t obvious. What you can do, however, is convert the float to an unsigned integer and using the hexadecimal values to sort them. I will not go in to detail on how to do this, since there are amazing articles by Pierre Terdiman and Michael Herf on Radix Sorting (negative) floating-point numbers. The idea is that most compilers comply to the IEEE 754 standard for representation of floating-point numbers in memory, making the least significant bits in the memory representation also the least significant digits in the floating-point number.

Quicksort

Quicksort is the algorithm I ended up using, since I couldn’t get the Radix Sort to work properly and Quicksort is easier to implement. Quicksort has the advantage that the sorting is in-place, meaning you don’t need additional memory for the sorting process. A downside to Quicksort is that the algorithm isn’t stable, giving Z-fighting issues in this demo.

The concept of Quicksort is to select a pivot point in the array (any random number, the median would be ideal, but costs more time to compute) and put lower numbers to the left of this pivot and higher numbers to the right. When this is done, you partition the array into several smaller partitions to sort the smaller and higher blocks. This process is recursive and ends when the partition size is too small to swap things over the pivot (for example, just the pivot left in the partition would indicate that that partition is sorted, or at least doesn’t require sorting).

Conclusion

This concludes my post about sorting algorithms. The one I picked was Quicksort, since it is easy to understand, implement, and find reference for. It is incredibly fast, but I consider the Z-fighting issues annoying. I’d require a stable sorting algorithm to get rid of that. Radix Sort would be a viable candidate for that, but for the time I had, it took too much to fully comprehend and implement it. The Tree Sort is really easy to implement, since I made up the code as I wrote this post, but it’s slow as well. It could use some clever optimizing, but I wanted to focus on getting the program fast first. The downside to Radix Sort and Tree Sort is that it isn’t in-place, meaning that you need extra memory for sorting the arrays. Memory wasn’t an issue in this assignment, but it can be an issue in future projects, which is why I took it into consideration when picking Quicksort.