# When is the Best Sorting Algorithm the best?

There are many problems in the real world, and for each problem there is a solution. In computer science, such a solution is called an algorithm. However, a single problem often has many algorithms that achieve the required result. To find the best one, we do time complexity analysis (referred to as TCA from here on).

Those of you who know the basics of TCA will know that we select the algorithm that performs best for very large input sizes. For example, the graph below depicts the run time of two algorithms, A and B, for increasing input sizes.

According to conventional TCA, algorithm A is better than algorithm B, because for large input sizes the average execution time is lower for algorithm A.

Algorithm A, as we can see, has linear time complexity, O(n), whereas algorithm B is quadratic, O(n²). In general, in TCA we consider an O(n) algorithm better than an O(n²) algorithm, because for a very large input size (strictly, an input size tending to infinity), an O(n) algorithm will run faster than an O(n²) algorithm.

Now consider the following example. Here again, according to TCA, algorithm C is better than algorithm D. However, notice that for input sizes below 10,000, algorithm D performed better than algorithm C.

Note that an input size of 10,000 is quite big and might well be the actual input size of the problem. With conventional TCA we would have used algorithm C for our practical application, when it would clearly have been better to use algorithm D.
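To make the crossover concrete, here is a minimal sketch with two made-up cost models. The constants are assumptions chosen only so that the curves cross near 10,000; they are not measurements of any real algorithm.

```python
from typing import Optional


def cost_c(n: int) -> int:
    """Hypothetical linear algorithm with a large constant factor."""
    return 10_000 * n


def cost_d(n: int) -> int:
    """Hypothetical quadratic algorithm with a small constant factor."""
    return n * n


def crossover(limit: int = 1_000_000) -> Optional[int]:
    """Return the smallest n at which C becomes strictly cheaper than D."""
    for n in range(1, limit + 1):
        if cost_c(n) < cost_d(n):
            return n
    return None  # no crossover found within the limit


if __name__ == "__main__":
    print("C becomes cheaper than D at n =", crossover())
```

Below the crossover the "worse" quadratic algorithm wins, which is exactly the situation that asymptotic analysis alone hides.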

A very common problem in computing is sorting: given an input array of numbers, produce an output array with the same elements in increasing or decreasing order. For example,

Input: [7,3,8,4,2,5,1,6,9,0]

Output: [0,1,2,3,4,5,6,7,8,9]
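As a quick illustration of the problem statement, the same example can be reproduced with Python's built-in `sorted`, which returns a new list and leaves the input untouched. This only shows the expected input and output, not any of the algorithms compared in this post.

```python
data = [7, 3, 8, 4, 2, 5, 1, 6, 9, 0]

# sorted() returns a new list; the original list is not modified.
ascending = sorted(data)
descending = sorted(data, reverse=True)

print(ascending)
print(descending)
```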

As we know, there are quite a few solutions to this problem; you can check out these algorithms here.

Five of the most common sorting algorithms are:

1. Bubble Sort
2. Insertion Sort
3. Selection Sort
4. Quick Sort
5. Merge Sort

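The average-case time complexities of the above algorithms are as follows: bubble sort, insertion sort, and selection sort are O(n²), while quick sort and merge sort are O(n log n) (quick sort degrades to O(n²) in the worst case).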

We know that, for big input sizes, quick sort is the best of the five algorithms above; you may check out this comparison.

However, the question we are trying to answer here is: beyond what input size is quick sort the best, and which sorting algorithm is best for input sizes smaller than that?

I wrote some Python code that plots how the average execution time (averaged over 1000 inputs of the same size) varies with input size. I measured the actual execution time on my machine rather than operation counts, as this exercise is about finding the most practical solution.
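The measurement described above can be sketched roughly as follows. This is not the code from my repository, only a minimal illustration of the idea; the function name `average_sort_time`, the value range of the random inputs, and the use of `time.perf_counter` are assumptions made for this sketch.

```python
import random
import time
from typing import Callable, List


def average_sort_time(
    sort_fn: Callable[[List[int]], object],
    size: int,
    trials: int = 1000,
    seed: int = 0,
) -> float:
    """Average wall-clock seconds sort_fn takes on random inputs of one size."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    rng = random.Random(seed)  # fixed seed so every algorithm sees the same inputs
    total = 0.0
    for _ in range(trials):
        # Build the input outside the timed region so only sorting is measured.
        data = [rng.randint(0, 1_000_000) for _ in range(size)]
        start = time.perf_counter()
        sort_fn(data)
        total += time.perf_counter() - start
    return total / trials


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, average_sort_time(sorted, n, trials=100))
```

Passing each sorting function to the same helper with the same seed keeps the comparison fair, since all algorithms are then timed on identical inputs.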

The graph is quite noisy because of operating system overheads, even though I closed as many applications and background processes as I could.

In spite of the noise, we can still make out the order of preference of the sorting algorithms for large input sizes:

Quick > Merge > Insertion > Selection > Bubble

I also ran the above code multiple times for small input sizes.

Here we notice that for input sizes below 40, insertion sort performs the best of all the above algorithms. So if you are working on an application that sorts inputs of fewer than 40 elements, you might want to consider using insertion sort rather than quick sort!
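One common way to act on a result like this is a hybrid sort: quick sort that hands small subarrays over to insertion sort. The sketch below is not part of the code I benchmarked. The cutoff of 40 is simply the value observed on my machine, and the middle-element pivot is an arbitrary choice, so treat both as assumptions to tune.

```python
from typing import List, Optional

CUTOFF = 40  # assumption: taken from the measurement above; tune per machine


def insertion_sort(a: List[int], lo: int, hi: int) -> None:
    """Sort a[lo..hi] in place (both bounds inclusive)."""
    for i in range(lo + 1, hi + 1):
        key = a[i]
        j = i - 1
        # Shift larger elements one slot right to make room for key.
        while j >= lo and a[j] > key:
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = key


def partition(a: List[int], lo: int, hi: int) -> int:
    """Lomuto partition around the middle element; returns the pivot's final index."""
    mid = (lo + hi) // 2
    a[mid], a[hi] = a[hi], a[mid]  # move the pivot to the end
    pivot = a[hi]
    i = lo
    for j in range(lo, hi):
        if a[j] < pivot:
            a[i], a[j] = a[j], a[i]
            i += 1
    a[i], a[hi] = a[hi], a[i]  # put the pivot in its final place
    return i


def hybrid_quick_sort(a: List[int], lo: int = 0, hi: Optional[int] = None) -> None:
    """Quick sort that switches to insertion sort below CUTOFF elements."""
    if hi is None:
        hi = len(a) - 1
    while lo < hi:
        if hi - lo + 1 < CUTOFF:
            insertion_sort(a, lo, hi)
            return
        p = partition(a, lo, hi)
        # Recurse into the smaller side and loop on the larger one,
        # which keeps the recursion depth logarithmic.
        if p - lo < hi - p:
            hybrid_quick_sort(a, lo, p - 1)
            lo = p + 1
        else:
            hybrid_quick_sort(a, p + 1, hi)
            hi = p - 1


if __name__ == "__main__":
    values = [7, 3, 8, 4, 2, 5, 1, 6, 9, 0]
    hybrid_quick_sort(values)
    print(values)
```

Whether the hybrid actually beats plain quick sort depends on the machine and the implementation, so it is worth timing it the same way as the other algorithms before adopting it.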

The code that I used for the above analysis can be found at this link:

https://github.com/devarshi16/sorting_comparison

Remember to like this post and star my GitHub repository if this post helped you!

Note that you should close as many applications as you can before running the code. Once the code is running, do not disturb the computer, as doing so adds overhead. If possible, run the code while the system is offline.

There are still a lot of things that you can do with this code. Two such things are:

1. Add more sorting algorithms for comparison from the plethora of sorting algorithms available here.
2. If you look carefully, you will see that merge sort performs horribly for small input sizes (probably because it uses extra space for sorting). However, for big input sizes, it is the second-best sorting algorithm. At what input size does that change happen? Try to find out!
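For the second exercise, a small helper like the one below could locate the crossover from your own measurements. The function name `overtakes_at` is hypothetical, and the timings in the example are made-up placeholder numbers, not results from my runs; replace them with the averages you measure.

```python
from typing import Optional, Sequence


def overtakes_at(
    sizes: Sequence[int],
    times_a: Sequence[float],
    times_b: Sequence[float],
) -> Optional[int]:
    """Smallest measured size from which A stays faster than B.

    Assumes sizes are in ascending order. Scanning from the largest size
    backwards ignores isolated noisy wins at small sizes.
    """
    if not (len(sizes) == len(times_a) == len(times_b)):
        raise ValueError("all three sequences must have the same length")
    answer = None
    for size, ta, tb in reversed(list(zip(sizes, times_a, times_b))):
        if ta < tb:
            answer = size
        else:
            break
    return answer


if __name__ == "__main__":
    # Placeholder numbers in arbitrary time units, for illustration only.
    sizes = [10, 20, 40, 80, 160]
    merge_times = [9, 20, 44, 96, 208]
    insertion_times = [3, 10, 36, 136, 528]
    print(overtakes_at(sizes, merge_times, insertion_times))
```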

Feel free to comment with your opinions and to point out errors. Thanks!

## 4 replies on “When is the Best Sorting Algorithm the best?”

Thanks! It helped me understand sorting algorithms better 🙂

You’re welcome! Glad I could be of help!

Berend Kemper says:

Check out my util-int-ySort algorithm (roughly 10x faster than quickSort on a million-size array)
https://github.com/BerendKemper/utilintySort

I needed to thank you for this fantastic read!! I absolutely loved every bit of it. I have got you bookmarked to look at new stuff you post…
