Asymptotic Analysis in Practice

Concepts

0_LcMMZTV76P6vD5mc

In this final article we will look at analysing the complexity of an algorithm without the use of asymptotic analysis, and in doing so showcase the benefits of using Big O Notation. Additionally we will look at the everyday practical uses of Big O when developing software applications.

screen-shot-2017-07-16-at-13-08-25.png

The above code doesn’t do anything meaningful. Given an array of integers of n size, iterate through the array, and for any element equal to 5 we simply multiply it by 6 and print it.

Without asymptotic analysis how would we go about evaluating this algorithm for efficiency without performing some kind of benchmark? Well, the simple answer would be to count the amount of instructions executed. The more instructions that have to be executed the more time its going to take for the computer to finish executing that algorithm. Therefore, knowing how many instructions the algorithm is made of seems like a good starting point.

The example can be broken down into the following: a for loop that executes n times, an if statement, a multiplication, and a print statement.

A problem arises though when trying to base our performance on the number of instructions in the algorithm, our algorithm wont always execute all of them. The multiplication and print instructions only occur if the element is 5, and we cannot be sure if, or how many elements in an array will be equal to 5.

Additionally, the code we write is broken down into something that is understood by a computer. The number of instructions it is broken down into depends on the hardware and compiler. Which means our algorithm again is only being tested against a specific hardware/software configuration.

As we have already seen then asymptotic analysis removes these dependencies, allowing us to test independent of any specific implementation. In asymptotic anlaysis the constant values are ignored, this makes sense in our case as this relates to the instuctions themselves, which like we said may be different.

Therefore, we simply need to look at the algorithm and determine its order of growth. The highest power related to its input, in this case it is simply n. We loop around all of the elements once and possibly do something if we have a value of 5.

Now we can use Big O to describe this algorithms upperbound, or worst case scenario. We can say that the algorithm f(n) is worst when every element in the array is equal to 5, as we would have to execute 2 instructions for each element along with executing the if statement. We would end up with a function f(n) = 2n + 1. As we stated though, we are not concerned with constant values and we know the order of growth is n so we can simply say that our algorithm has a growth rate O(n).

Asymptotic analysis allows us to calculate the efficiency of code based on its input. The best use case for big O is when developing scalable algorithms. An algorithm that has to perform well even when its input is huge. A good example of scaleable algorithms are those related to searching for data. These have to work well for 10 elements as well as for a 100,000. We know as n gets bigger it would take more time for the algorithm, therefore an algorithm with a low order of growth will perform a lot better with large input than one with a higher growth rate.

However, this best use case scenario, may not end up being how you use asymptotic analysis. It all depends on what you end up developing. Most developers won’t be writing scalable algorithms, and instead rely on asymptotic analysis for evaluation and decision making. That is asymptotic analysis is useful at identifying bottle necks in a piece of software, and acts as a useful tool when deciding what algorithms or data structures to use to solve a given problem. Without such knowledge you will be less able to identify performance issues and are more likely to write code that causes them.

I hope this article has provided some context on the theory previously discussed, whilst showing pratical uses of asymptotic analysis. Dont worry though as you become a more experienced programmer using Big O will become second nature.

Describing Algorithms with Big O Asymptotic Notation

Concepts

alphabet-2051685_1920.png

In the previous article we looked into techniques for determining an algorithms efficiency, specifically looking into conducting asymptotic analysis of our algorithms. Today we will look at describing our analysis using a specific type of asymptotic notation, Big O. Additionally, we will look into common classifications of algorithms in terms of describing their runtime complexity with Big O.

Asymptotic analysis can help us define the limiting behaviour of a function. For our purposes we can use it to determine the efficiency of our algorithm based on the size of the input. Asymptotic notation helps us describe the analysis of a function.

We can describe the upper bound (Big O), lower bound (Big Omega), or both with Big Theta. The focus of this article is on the discussion of Big O notation, as it is the most commonly used notation when describing algorithms in computer science.

I will save you from suffering through the formal definition of Big O, especially since its not all that helpful for our purposes. If you are interested though, you can read about it on this Wikipedia page.

Big O is a notation that describes the upperbound of our algorithm. The upperbound can be seen as the worse case scenario, measuring against some metric e.g. execution time, memory consumption, etc. The notation for Big O is written O(f(n)) read: “order n” where f(n) describes the largest term within the function.

The biggest term of a function, also known as the highest order, is simply the largest term in that function. Largest describing the term with the highest power. For example, given a function f(n) = 2n^2 + 3n + 5 we can see that the largest term is 2n^2. This means we can describe the function as having an order of growth n^2, O(n^2).

The order of growth of a function is dependent upon its largest term. If f(n) = 2n^2, and g(n) = 2n, we can say that f(n) has a larger order of growth than g(n). As the value of n increases the output of f(n) is going to be much larger than that of g(n).

There are a few things that need to be mentioned before we look at common classifications.

Firstly, just to reiterate Big O is just a form of notation for describing the upperbound of our algorithm. Asymptotic notation in general is just simply a shorthand of describing the behaviour of an algorithm. That is, if we analysed our algorithm in terms of its worst case performance we would use Big O notation denote this behaviour.

Secondly, if an algorithm is described with O(f(n)) we can assume that it will act at worse O(f(n)) but we cannot assume that it will always be of O(f(n)). Therefore, be cautious as O(f(n)) might not tell you all the information you need to know about an algorithm e.g. best/average case scenarios.

Finally, it is worth noting that Big O is a guideline and not a guarantee. For some value n with a large constant c, it might be that a quadractic algorithm may be better than a linear function in cases where n is small. For example, given f(n) = 2,000n + 50 and g(n) = 2n^2 the quadratic algorithm g(n) is more efficient than the linear algorithm f(n) despite the growth of the function being larger.

This means that asymptotic analysis only holds when the value of n is large, and if its not then you cannot assume that a function with a lower order of growth is more efficient than a larger one. Remember asymptotic analysis is only concerned with the largest term, and not its constants or any other terms

With all that out of the way we can look at how to describe certain algorithms in terms of O(f(n)). Below is a few common orders of growth. Each graph shows how execution time is affected as the value of n increases.

Screen Shot 2017-07-09 at 10.24.06.png

O(1) (constant): An algorithm that always requires the same amount of time to execute. Its order of growth is constant. Any single statement of code is constant e.g. print(“Hello World”);, int a = b – c;, etc

Screen Shot 2017-07-09 at 10.20.41

O(log n) (logarithmic): An algorithm in which the time required is dependent upon the log of the input n. Its order of growth is proportional to log n where log is to the base 2. The algorithm will typically take longer to execute as n increases, but after a certain point the input of n wont affect the execution time. An example of an algorithm of O(log n) is a binary search.

Screen Shot 2017-07-09 at 10.19.42

O(n) (linear): An algorithm in which the time required to execute is dependent upon the size of the input n. Its order of growth is proportional to n. That is, as n increases the time taken to execute the algorithm will also grow at the same rate as n. An algorithm that uses a single loop iterating n times. An example of such an algorithm is a linear search.

Screen Shot 2017-07-09 at 10.22.16

O(n^2) (quadratic): An algorithm in which time required is dependent upon the size of the input n squred. Its order of growth is proportional to n^2. That is the execution time will increase dramatically as n gets larger. A typcial exmaple is any algorithm that makes use of two loops, for instance an insertion sort.

To conclude asymptotic analysis is a means of measuring the performance of an algorithm based on its input. Big O is a form of asymptotic notation that denotes the worst case scneario of our algorithm. Hopefully this has all made some sense, but if not don’t worry, the next article will put asymptotic analysis and Big O into practice through some examples.

An introduction to Theoretically Evaluating Algorithm Efficiency with Asymptotic Notation  

Programming

speedometer-653246_1920

Don’t get scared away by the title; what we are going to talk about isn’t all that complicated. In a previous post I introduced the concept of an algorithm, and gave a brief description into how an algorithm is deemed to be efficient. This article will take this further by discussing techniques available to us for testing the efficiency of our algorithms before we go about implementing them in code.

Before we get started, lets first go back over efficiency. An algorithm can be efficient if it meets the memory or running time requirements imposed. Basically, our algorithm must use less than a maximum amount of memory, or run no slower than an amount of time specified. The restrictions imposed are dependent up on the problem we are trying to solve.

In order to test for efficiency, an algorithm must go through a theoretical analysis, using asymptotic analysis, before the algorithm is implemented.

The reason for this theoretical analysis is that simply without it our algorithms could only be tested through implementation.

Why is this bad? Well, firstly, we have to perform the implementation before we have any idea of how the algorithm will run. Meaning you could spend a long time developing something to realise that this algorithm will not run the way you want it to.

Secondly, by testing an algorithm through implementation we are making our algorithm dependent upon a specific platform, programming language, and hardware configuration. Altering any of these variables could result in a different result. Given the shear amount of variation we could never test our algorithm for all possible configurations.

Having a way of analysing our algorithm before we start implementing it allows us to save time, but more importantly allows us to judge efficiency independent of any hardware or software.

As described by Wikipedia, asymptotic analysis is the field of mathematics for describing the limiting behaviour of functions. A limit of a function is the value a function approaches as the input of that function approaches some value, usually zero or infinity.

Therefore, we are looking at the output of our function against a specific value, based on the values we are passing into the function.

If we have the function f(x) = e^x we could look at the output of that function as x tends towards infinity. Basically our function output grows exponentially as the value of x gets larger.

Asymptotic notation is a language that describes the type of behaviour of a function respective to the growth of that function. What I mean by this is that given a function f(n) = 2n^2 + 600n + 200 we are only concerned with the most significant term, n^2, because as n tends towards infinity the other terms and constants become irrelevant, as shown in the graph below.

As you can see from the graph the n^2 term results in a significantly larger output as the input size increases.

There are a few different types of notation and in the next article we will go into a lot more detail about one of them, but for now lets talk about how all this relates back to algorithms.

This idea can be applied to our algorithms, whereby the input of our function is the size of our input of the algorithm. Input is the metric we use as algorithms are designed to work with inputted data because an algorithm is useless without it. A search algorithm requires elements in which to search, as does a sorting algorithm needs input to sort.

As the input increases in size we can see that an algorithm might take longer to complete, or require more memory. It would take a lot less CPU cycles, or steps to search through a 100 items as it would do to search through 100,000.

This leads us onto the output of our function, which is what we want to measure for within our algorithm. If we are measuring the time it takes to run then we would like to see how long our algorithm takes to complete as the input amount increases. If we want to measure against memory, we would want to see how much memory is used up as the amount of input increases.

Therefore asymptotic analysis is required to measure the running time or memory capacity required by our algorithms as the input size increases. Asymptotic notation is where we describe our function as a rate of growth using the most significant term, and removing any insignificant terms or constants. We end up with an independent method for determing the efficiency of an algorithm.

In the next article we will look at a specific form of asymptotic notation, Big O notation, which is commonly used in computer science for measuring an algorithms performance.