Find Median in an Unsorted Array Without Sorting it

Given an unsorted Array A[1,2,3,…N], Find the Medians in this array without sorting it in an efficient Way?

5 Likes

This Problem Can be done is a linear Time O(N),where N=A.length() . Yes ,Selection Algorithm Finds the Median of an unsorted Array without Sorting it.

The Selection Algorithm uses the concept of Quick Sort[But does not actually sort the array though] ,especially the partition Steps.

This algorithm works in two steps. The partitioning step works by picking some pivot element, then rearranging the elements of the array such that everything less than the pivot is to one side, everything greater than the pivot is to the other side, and the pivot is in the correct place.

In a simple words::

In a quick sort ,when you calls the Partition Function, and Lets say the partition Function returns ‘Idx’ , then the job of sorting ‘Idx’ position is Done, that is the A[Idx] contains the same element as the Idx th element of final answers (sorted Array) ,and all element to the left of ‘Idx’ is less than or equal to A[idx].

This is the main concept used by the selection Algorithm.

So to find the Medians basically our aim is to call recursively,the partitions functions unless the Pivot
reached the N/ 2th element.

This is the implementation in C++ :

int A[MAX],N;
int partitions(int low,int high)
{
	int p=low,r=high,x=A[r],i=p-1;
	for(int j=p;j<=r-1;j++)
	{
		if (A[j]<=x)
		{

			i=i+1;
			swap(A[i],A[j]);
		}
	}
	swap(A[i+1],A[r]);
	return i+1;
}
int selection_algorithm(int left,int right,int kth)
{
	for(;;)
	{
		int pivotIndex=partitions(left,right);			//Select the Pivot Between Left and Right
		int len=pivotIndex-left+1;

		if(kth==len)
			return A[pivotIndex];

		else if(kth<len)
			right=pivotIndex-1;

		else
		{
			kth=kth-len;
			left=pivotIndex+1;
		}
	}
}

Run this code calling the function , selection_algorithm(1,N,K),where K=N/2 (For median Elements)from the Main Function.

Note:: Using selection Algorithm ,we can’t only find the Median elements without sorting , We can Find any Kth Smallest element .

Also ,note In partition Function,i selected the Pivot as the last element of Array( x=A[r] ).

You can improve the performance using the Randomized-Partition concept given in CLRS.

Hope this Helps!!!

26 Likes

[Median of Medians][1]http://cs.indstate.edu/~spitla/presentation.pdf
[1]: http://cs.indstate.edu/~spitla/presentation.pdf

int select(int *a, int s, int e, int k) {
if(e-s+1 <= 5)
{
    sort(a+s, a+e);
    return s+k-1;
}

for(int i=0; i<(e+1)/5; i++)
{
    int left = 5*i;
    int right = left + 4;
    if(right > e) right = e;
    
    int median = select(a, left, right, 3);
    swap(a[median], a[i]);
}
return select(a, 0, (e+1)/5, (e+1)/10); }

int main() 
{  
int a[] = {6,7,8,1,2,3,4,5,9,10};
int n = 10;

int mom = select(a, 0, n-1, n/2);
cout<<"Median of Medians: " << a[mom] << endl;
return 0; }
2 Likes

it actually depends on sortedness of data

if data is sorted
use medians of medians because this is worst case for normal quick select it will go in O(n^2) but med-of-med will keep worst case also in O(n) , obviously there will be some overhead of calculating pivot.

if data is partially sorted
use randomised quickselect

if data is unsorted
use normal quickselect (starting from first index as pivot element

choice is to choose from three methods
i) medians of medians
ii)randomised quickselect
iii) normal quickselect

median for even no fo elements (m)= (N+1)/2;(if array starts with index 1)

approch to find median without sorting array -:
chose a random number(pivot) and calculate the no of elements which are greater than equal to pivot element and update right counter® also calculate the no of elements which are less than pivot element and update left counter(l).
if l == m-1 && r == m then mth element is median.

code in c -:

int main()

{

int l,r,m,N,a[20],p;

clrscr();

l=0;

r=0;

m = N;

p = (N+1)/2;

for(i=1;i<=N-1;i++)

{

if(i == m)

continue;

else{

if(a[m]<a[i])

l++;

else

r++; }

if(l == (p-1) && r == p){

printf(“median is %dth locatin %d”,m,a[m]);

break;  }


 m--;
         }

return 0;
}

Compelxity O(N) !

Hope this helps !

I can explain how the code using Quicksort idea O(n) complexity.
We can see, at first time run partition function, we compared n times. In average case, the array will shrink the size down to n/2 (left-right = n/2), and partition function run, we compare n/2 times… Continue, we compared n, n/2, n/4, n/8, … times.
n, n/2, n/4, n/8 is a geometric progression with initial value = n, common ration = 1/2.
SUM(n, n/2, n/4, n/8,…) = lim(t->oo) n*(1-1/2^t)/(1-1/2) = n/(1/2) = 2n.
We averagely compared 2n times, so it’s O(n) complexity.

Use std::nth_element

@ritesh_gupta :: Thanks a lot .Never thought ,there exist a linear Solution for Median Finding :slight_smile:

cpluspluscoder :slight_smile:

I cannot see at first sight that this is O(n) algorithm, especially if it’s inspired by quick sort where worst case is O(n^2), also “Median of Medians algorithm” is O(n), but (if I’m not wrong here) with some big constant O(C*n). When C is 20 or so, O(n*log(n)) algorithm is equally good or better for n <= 1.000.000 …

1 Like

for an array of integers elements, there exists a sorting algorithm in O(n). thus, you can find the median by sorting it and keeping O(n) anyway. ( Counting sort - Wikipedia )

2 Likes

But Counting Sort is inefficient when diff b.w n and max element is really large and is almost O(N^2)so i think Selection Algo is the Best

why is this linear? median of medians divides the whole array into groups of 5 elements then insertion sorts the subarrays…these are then used…i think ritesh your algorithm will be nlogn or since quicksort’s worst case is n^2 it will be On^2 in worst case.

( http://cs.indstate.edu/~spitla/presentation.pdf )

1 Like

How time complexity is O(n) ?