Thursday, January 6, 2011

The Sorting problem - PART II


In this session, we will talk about insertion sort. Insertion sort is a sorting technique which has the same running time as the bubble sort.
First the technique behind the insertion sort. The input that is the array of numbers will be iterated and during iteration each element is taken as the key and this number is inserted in the list of elements before this element preserving the non decreasing order property. So the algorithm will start from the second element and compare with the previous element and find its place. then the next iteration and so on. I think now you got why the name as insertion sort.

This is the java implementation of the insertion sort.

public class InsertionSort {

public int[] sort(int []input){


for(int i=1;i<input.length;i++){
int key = input[i];
int j=i-1;
while(j>=0 && input[j]>key){
input[j+1] = input[j];
j--;
}
input[j+1] = key;
   }
return input;
}
public static void main(String[] args) {
new InsertionSort().sort(new int[]{10,23,2,5,23,675,1,23});
}

}

Here you can see that the algorithm consists of a for loop which iterates n-1 times and and while loop which iterates n-1 times. so the running time will be O(n^2).

We can easily prove the correctness of the algorithm by stating the loop invariant as previous terms will be in sorted order. This is true before the first iteration (Initialization) and before the start of every iteration(Maintenance) and at the completion of the algorithm(Termination).

Now lets look at another algorithm which is very similar to the one that we discussed. It is called the selection sort which has the same running time as the insertion sort.
The design of this algorithm is as follows. Sorting n numbers stored in array A by first finding the smallest element of A and exchanging it with the element in first element. Then find the second smallest element of A, and exchange it with second element. Continue in this manner for the first n-1 elements of A.

This is the java implementation of the selection sort.


package sorting;

public class SelectionSort {

public int[] sort(int []input){


for(int i=0;i<input.length-1;i++){
          int j=i+1;
while(j<input.length){
if(input[i]>input[j]){
int temp = input[i];
//swapping the elements
input[i] = input[j];
input[j] = temp;
}
j++;
}
   }
return input;
}
public static void main(String[] args) {
new SelectionSort().sort(new int[]{10,23,2,5,23,675,1,23});
}

}

Here you can see that the algorithm consists of a for loop which iterates n-1 times and and while loop which iterates n-1 times. so the running time will be O(n^2).

We can easily prove the correctness of the algorithm by stating the loop invariant as previous terms will be in sorted order. This is true before the first iteration (Initialization) and before the start of every iteration(Maintenance) and at the completion of the algorithm(Termination).This is the same as the insertion sort.

more to come .. merge sort, quick sort, heap sort

Wednesday, January 5, 2011

The sorting problem - PART I

Let's be clear on the understanding about the sorting. We define problem in terms of input and output.
Here


Input: A sequence of n numbers (a1,a2,......,aN)
Output: A permutation (reordering) (a1',a2',......,aN') of the input sequence such that a1'<= a2'<=aN'.


Yes, Here we are going to deal with a problem of sorting numbers in the non decreasing order.

For solving this problem, one of the simplest algorithm is bubble sort or sinking sort which is easy to code but inefficient. So i have to give a warning "Don't try this in your app!"

Bubble sort, also known as sinking sort, is a simple sorting algorithm that works by repeatedly stepping through the list to be sorted, comparing each pair of adjacent items and swapping them if they are in the wrong order.This is a wiki definition.


Let's look at the algorithm. An important node Here i won't be giving any pseudocode. Instead i will be giving the implementation of the algorithm in Java.



public class BubbleSort { public int[] sort(int []input){ for(int i=0;i<input.length;i++) for(int j=i+1;j<input.length;j++){ if(input[i]>input[j]){ int temp = input[i]; //swap the elements input[i] = input[j]; input[j] = temp; } } return input; } public static void main(String[] args) { new BubbleSort().sort(new int[]{10,23,2,5,23,675,1,23}); } }


Here you need to iterate the array and while iterating for each element you have to find the element in that position. So we have to compare this with every element to check that whether this number is eligible to stay in that position. or for example if the position of the number is 2 in the array. Is this number the second smallest in the array? This is checked by comparing with the remaining elements of the array. This is a straightforward implementation of solving the problem. 


Now let's talk of the design of the algorithm. Here this is designed by the fact that for every ith iteration we will find the ith smallest number. so if the algorithm is in the ith iteration then it should have completed all the 
i-1 small elements.  To find the ith smallest element it just have to compare with the rest of the elements because the i-1 elements have been sorted.


When we analyze the algorithm, we could find that for the given input size 'n', the algorithm will execute the loop n^2 times. since for every element it will check whether it is the smallest in the remaining elements. When we analyze the algorithm we will look for most running section of code.We will find out the no of times each line of code will run and then we take the higher order term. Then that is the worst case running time of the algorithm.So the running time will be O(n^2) in the worst case. 


Now the correctness. To prove the correctness of the algorithm, we have to see the properties of the elements which we already processed before the current iteration. These properties are called loop invariant.
 I will explain this. Here the loop is the two for loops in the code. Here property is that the elements before the current iteration will be in sorted order. To prove the correctness of the algorithm, we have to prove that this property holds for Initialization, Maintenance and Termination section of the code.


Initialization: It is true prior to the first iteration of the loop.



Maintenance: If it is true before an iteration of the loop, it remains true before the
next iteration.


Termination: When the loop terminates, the invariant gives us a useful property
that helps show that the algorithm is correct.


Here we can find that before the first iteration the no of elements already sorted is 0. This proves the Initialization. If we check the elements before every iteration, we can find that the no of elements sorted will be i-1 before ith iteration. This proves the Maintenance.When the loop terminates we can find that the whole array is sorted and the iteration is i+1 (but this iteration won't happen). At this point of time all the i ((i+1)-1) elements are sorted in the array. Thus the correctness of this algorithm is proved.




Coming up.. Insertion sort.



Tuesday, January 4, 2011

Introduction to the algorithms: sorting - part 1

Let me talk some time about what are we going to do. We basically discuss algorithms, its design, its running time (analysis) and its correctness and of course its intention or what are its applications. Informally, an algorithm is any well-defined computational procedure that takes some value, or set of values, as input and produces some value, or set of values, as output. An algorithm is thus a sequence of computational steps that transform the
input into the output. This is the definition that i got from the text em.. this is straightforward isn't. For new people  just think this as a function containing the computational steps required to convert input to output. For designing and analyzing the algorithms, a fair knowledge of mathematics is required. I will explain as simple as possible.

As i mentioned before we will be talking about design and analysis of algorithms and if possible its correctness. The design will give you an idea of how to design an algorithm for similar form of problem and call this as the algorithm design technique. The analysis will give you the resources that the algorithm will take while its execution. There are many type of resources required while an algorithm runs in a system for example memory, its hardware, time etc. Here we are concentrating only on the time that is how fast can it execute in the system. Its correctness is very important which proves that the algorithms will execute and produce the correct output for any set of inputs. I will try to explain about this as i gets the data.

We will first start with the sorting problem.

First post

Hey this is the author speaking. This is the first post in the data structures and algorithms blog. Through this blog i am planning to jot down mostly about algorithms. I will deal with lot of algorithms and data structures that i learn through the introduction to algorithms (third edition). I am starting with sorting.Please feel free to type you questions and mail those to nishantmc at gmail dot com.