C++ optimization - Picking the optimal algorithm.

Perhaps the single biggest factor, which can make a program run much faster, is replacing an inefficient algorithm with an efficient one. For instance, an efficient sorting algorithm like quick sort, running on a regular laptop is much faster than bubble sort running on a supercomputer, on a large and randomized set of input data.

We measure the time performance of an algorithm, by noting down how much the running time increases by, for incremental increases in the input data size. If the running time remains the same for increasing data sizes, we say the time complexity / performance is constant. If the running time doubles when the input data size doubles, then the time complexity is linear. Similarly we have log, log-linear, quadratic, cubic, exponential time complexities. Algorithms with constant, log, log linear and linear time complexities scale well with input data. Quadratic does not scale well, relatively. Cubic and exponential does not scale well at all and may take a lot of time when input data is huge.

The steps to optimize the algorithm are -
1. Measure / determine time complexity of  the currently used algorithm.
2. Search / create an algorithm with better time complexity
3, Replace the older algorithm with the new one.
4. Measure the performance and tweak it till performance is satisfactory.

Some real world algorithms use a combination of two or more textbook algorithms. For example, std::sort uses introsort. Introsort is a combination of two sorts. It starts off with quicksort, but when the recursion goes too deep, heap sort is used.

There are two most common things which are commonly needs to be done in programs and can be optimized. They are
1. Searching / finding for an element
2. Sorting.


Searching / Finding

Almost all applications/programs does some kind of lookup in the course of it's execution. The most common search techniques are

Linear search - This search examines all the elements in an order, from beginning to the end and compares to see if the element indeed matches our search item. If matched, it ends the search.
std::find and it's variants is commonly used in modern C++ for this purpose.

Binary search - This technique is used on a sorted data-set, and it goes on dividing the data into two parts and recursively searches till the mid element is equal to our search item. In modern C++, std::binary_search is used to determine if the search item is present or not, std::lower_bound is used to find the position of the item.

Interpolation Search - Instead of dividing the input set into two almost equal parts at the center, interpolation search uses knowledge about the input data to determine the more appropriate point of partition. It has better time complexity than binary search. You can read more about it here.

Hashing  -  This has the best time complexity(constant complexity). It stores all the input data using the hash function. When search function is called, the hash is computed on the search element to determine the location of the data.


Sorting 

Sorting is another task commonly done in most applications. Here are some popular sorts -

Bubble sort - The most basic sort algorithm, but with very bad time complexity (n * n).

Quick sort - a algorithm which partitions data-set into two at the partition point, and recursively sorts the subsets. Average complexity is n*Log(n). But worst case compexity is n*n.

Merge sort - Another algorithm which recursively divides the dataset into two until there is no more than one element in a single dataset. From there on ordered merge is applied on the subsets. Average and worst case is n * Log(n)

There are many other sorts like insertion sort, radix sort, bucket sort etc.

By default in a C++ application, we use std::sort and std::stable_sort. But in-order to optimize as much as possible, it is advisable to examine the data and see if we can take advantage of a structure of data to choose the fastest sort mechanism. For instance, if the input data is almost sorted, then a bubble sort or insertion sort might be much faster than std::sort.

Hence to conclude, optimizing the algorithm should be your first step, when we undergo optimization exercise on your program.

This article is inspired by the book Optimized C++. I highly recommend you guys to check it out.








C++ optimization - Picking the optimal algorithm. C++ optimization - Picking the optimal algorithm. Reviewed by zeroingTheDot on July 05, 2018 Rating: 5

No comments:

Powered by Blogger.