Find Median in an Unsorted Array Without Sorting it

cpluspluscoder · June 30, 2012, 11:07pm

Given an unsorted array A[1,2,3,…N], how can we find the median of the array efficiently without sorting it?

ritesh_gupta · June 30, 2012, 11:25pm

This problem can be solved in linear time on average, O(N), where N = A.length(). Yes, the Selection Algorithm finds the median of an unsorted array without sorting it.

The Selection Algorithm uses the idea behind Quick Sort (without actually sorting the array), especially the partition step.

This algorithm works in two steps. The partitioning step picks some pivot element, then rearranges the elements of the array so that everything less than the pivot is on one side, everything greater than the pivot is on the other side, and the pivot is in its correct place. The selection step then continues the search only in the side that contains the position we want.

In simple words:

In quick sort, when you call the partition function and it returns ‘Idx’, the job of sorting position ‘Idx’ is done: A[Idx] contains the same element as the Idx-th element of the final answer (the sorted array), every element to the left of ‘Idx’ is less than or equal to A[Idx], and every element to the right is greater.

This is the main concept used by the selection Algorithm.

So to find the median, our aim is basically to call the partition function repeatedly until the pivot lands at the median position (about N/2).

This is the implementation in C++ :

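#include <algorithm>
#include <utility>	// std::swap
using namespace std;

const int MAX = 100005;	// assumed capacity; the original post leaves MAX undefined
int A[MAX],N;		// A is used 1-based: A[1..N]

// Partitions A[low..high] around the last element and returns the pivot's final index
int partitions(int low,int high)
{
	int p=low,r=high,x=A[r],i=p-1;
	for(int j=p;j<=r-1;j++)
	{
		if (A[j]<=x)
		{
			i=i+1;
			swap(A[i],A[j]);
		}
	}
	swap(A[i+1],A[r]);
	return i+1;
}
// Returns the kth smallest element of A[left..right] (kth is 1-based, counted from left)
int selection_algorithm(int left,int right,int kth)
{
	for(;;)
	{
		int pivotIndex=partitions(left,right);			//Select the Pivot Between Left and Right
		int len=pivotIndex-left+1;				//rank of the pivot within A[left..right]

		if(kth==len)
			return A[pivotIndex];

		else if(kth<len)
			right=pivotIndex-1;				//answer lies left of the pivot

		else
		{
			kth=kth-len;					//skip the len elements up to the pivot
			left=pivotIndex+1;
		}
	}
}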

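Run this code by calling selection_algorithm(1,N,K) from the main function, where K=(N+1)/2 for the median (with integer division this is the middle element for odd N and the lower of the two middle elements for even N).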

Note: Using the selection algorithm, we can find not only the median without sorting, but any Kth smallest element.

Also note that in the partition function, I selected the last element of the array as the pivot ( x=A[r] ).

You can improve the performance using the Randomized-Partition concept given in CLRS.
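
As a minimal sketch of that randomized variant (not the CLRS code itself): the only change from the partition above is that a random element is swapped into the last position before partitioning. The function name `randomized_select`, the 0-based `std::vector`, and the fixed seed are assumptions made for this illustration.

```cpp
#include <cassert>
#include <random>
#include <utility>
#include <vector>

// Returns the k-th smallest element of v (k is 1-based).
// v is taken by value so the caller's data is left untouched.
int randomized_select(std::vector<int> v, int k) {
    assert(k >= 1 && k <= static_cast<int>(v.size()));
    std::mt19937 rng(12345);  // fixed seed: an arbitrary choice for reproducibility
    int left = 0, right = static_cast<int>(v.size()) - 1;
    int target = k - 1;  // 0-based index of the answer in sorted order
    while (true) {
        // Move a randomly chosen pivot to the end, then run the same
        // last-element partition as in the code above.
        std::uniform_int_distribution<int> pick(left, right);
        std::swap(v[pick(rng)], v[right]);
        int x = v[right];
        int i = left - 1;
        for (int j = left; j < right; ++j) {
            if (v[j] <= x) {
                ++i;
                std::swap(v[i], v[j]);
            }
        }
        std::swap(v[i + 1], v[right]);
        int p = i + 1;  // final position of the pivot
        if (p == target) return v[p];
        if (target < p) right = p - 1;  // answer is left of the pivot
        else left = p + 1;              // answer is right of the pivot
    }
}
```

With a random pivot no fixed input order (such as already sorted data) is consistently bad, although the worst case is still quadratic if the random choices are unlucky.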

Hope this Helps!!!

panicker · July 3, 2013, 6:09pm

Median of Medians: http://cs.indstate.edu/~spitla/presentation.pdf

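#include <algorithm>
#include <iostream>
using namespace std;

// Returns the index of the median of medians of a[s..e] (a pivot candidate,
// not the exact k-th smallest); k is only used once the range has <= 5 elements.
int select(int *a, int s, int e, int k) {
    if(e-s+1 <= 5)
    {
        sort(a+s, a+e+1);   // sort's end is exclusive, so pass a+e+1
        return s+k-1;
    }

    int groups = (e-s+1)/5;   // full groups of 5; leftover elements are skipped
    for(int i=0; i<groups; i++)
    {
        int left = s + 5*i;
        int right = left + 4;

        int median = select(a, left, right, 3);
        swap(a[median], a[s+i]);   // gather the group medians at the front
    }
    return select(a, s, s+groups-1, (groups+1)/2);
}

int main()
{
    int a[] = {6,7,8,1,2,3,4,5,9,10};
    int n = 10;

    int mom = select(a, 0, n-1, n/2);
    cout<<"Median of Medians: " << a[mom] << endl;
    return 0;
}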
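
The code above only produces the median of medians, which is meant to serve as a pivot. A rough sketch of how that pivot could be plugged into a full selection routine follows. The names `mom_select` and `kth_smallest`, the use of `std::vector`, and the three-way partition are assumptions of this sketch and are not taken from the linked presentation.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Returns the k-th smallest value (k is 0-based, relative to lo) in v[lo..hi].
static int mom_select(std::vector<int>& v, int lo, int hi, int k) {
    while (true) {
        int n = hi - lo + 1;
        if (n <= 5) {
            std::sort(v.begin() + lo, v.begin() + hi + 1);
            return v[lo + k];
        }
        // 1. Move the median of each group of up to 5 to the front of the range.
        int groups = 0;
        for (int g = lo; g <= hi; g += 5) {
            int end = std::min(g + 4, hi);
            std::sort(v.begin() + g, v.begin() + end + 1);
            std::swap(v[lo + groups], v[g + (end - g) / 2]);
            ++groups;
        }
        // 2. Recursively take the median of those medians as the pivot value.
        int pivot = mom_select(v, lo, lo + groups - 1, (groups - 1) / 2);
        // 3. Three-way partition of v[lo..hi] around the pivot value.
        int lt = lo, i = lo, gt = hi;
        while (i <= gt) {
            if (v[i] < pivot) std::swap(v[lt++], v[i++]);
            else if (v[i] > pivot) std::swap(v[i], v[gt--]);
            else ++i;
        }
        // Now v[lo..lt-1] < pivot, v[lt..gt] == pivot, v[gt+1..hi] > pivot.
        int target = lo + k;
        if (target < lt) {
            hi = lt - 1;            // continue in the smaller part
        } else if (target > gt) {
            k = target - (gt + 1);  // rank relative to the new lo
            lo = gt + 1;            // continue in the larger part
        } else {
            return pivot;           // target falls inside the pivot block
        }
    }
}

// Returns the k-th smallest element of v (k is 1-based).
int kth_smallest(std::vector<int> v, int k) {
    assert(k >= 1 && k <= static_cast<int>(v.size()));
    return mom_select(v, 0, static_cast<int>(v.size()) - 1, k - 1);
}
```

For the array used above, `kth_smallest({6,7,8,1,2,3,4,5,9,10}, 5)` returns 5, the lower of the two middle values.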

road_runner · January 16, 2016, 11:03pm

It actually depends on the sortedness of the data.

If the data is sorted:
use median of medians, because sorted input is the worst case for normal quickselect, which degrades to O(n^2); median of medians keeps even the worst case at O(n), though obviously with some overhead for calculating the pivot.

If the data is partially sorted:
use randomised quickselect.

If the data is unsorted:
use normal quickselect (taking the element at the first index as the pivot).

So the choice is among three methods:
i) median of medians
ii) randomised quickselect
iii) normal quickselect
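
To make the sorted-input case concrete, here is a small sketch that counts comparisons for quickselect with a fixed pivot. It uses the last element as pivot, as in the first answer; on ascending data the first-element choice behaves badly in the same way. The names `count_comparisons` and `ascending` are made up for this illustration.

```cpp
#include <cassert>
#include <numeric>
#include <utility>
#include <vector>

// Quickselect with the last element as pivot, instrumented to count
// element comparisons. k is 1-based. Returns the number of comparisons.
long long count_comparisons(std::vector<int> v, int k) {
    assert(k >= 1 && k <= static_cast<int>(v.size()));
    long long comparisons = 0;
    int left = 0, right = static_cast<int>(v.size()) - 1, target = k - 1;
    while (true) {
        int x = v[right], i = left - 1;
        for (int j = left; j < right; ++j) {
            ++comparisons;
            if (v[j] <= x) std::swap(v[++i], v[j]);
        }
        std::swap(v[i + 1], v[right]);
        int p = i + 1;  // final position of the pivot
        if (p == target) return comparisons;
        if (target < p) right = p - 1;
        else left = p + 1;
    }
}

// Helper: the already sorted input 1, 2, ..., n.
std::vector<int> ascending(int n) {
    std::vector<int> v(n);
    std::iota(v.begin(), v.end(), 1);
    return v;
}
```

On `ascending(100)` with k = 1, every partition removes only the pivot, so the count is 99 + 98 + … + 1 = 4950 comparisons, which is the quadratic behaviour described above.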

grv221 · January 17, 2016, 11:55pm

Median position for an even number of elements: m = (N+1)/2 (if the array starts at index 1).

Approach to find the median without sorting the array:
Choose a random element (the pivot), count the elements that are greater than or equal to the pivot to update the right counter (r), and count the elements that are less than the pivot to update the left counter (l).
If l == m-1 && r == m, then the mth element is the median.

Code in C:

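#include <stdio.h>

int main()
{
    int a[20], N, i, m, l, r, p;

    /* Input handling is an assumption: read N (at most 19), then a[1..N]. */
    scanf("%d", &N);
    for (i = 1; i <= N; i++)
        scanf("%d", &a[i]);

    p = (N + 1) / 2;            /* 1-based position of the median */

    for (m = N; m >= 1; m--)    /* try each element as the candidate */
    {
        l = 0;                  /* elements less than a[m] */
        r = 0;                  /* elements greater than a[m] */
        for (i = 1; i <= N; i++)
        {
            if (a[i] < a[m])
                l++;
            else if (a[i] > a[m])
                r++;
        }
        /* a[m] is the p-th smallest if at most p-1 elements are smaller
           and at most N-p elements are larger (this also handles duplicates) */
        if (l <= p - 1 && r <= N - p)
        {
            printf("median is at location %d: %d\n", m, a[m]);
            break;
        }
    }
    return 0;
}

Complexity: O(N) per candidate checked, so O(N^2) in the worst case when many candidates must be tried.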

Hope this helps !

ntloi95 · March 21, 2017, 11:30pm

I can explain why the code using the Quicksort idea has O(n) complexity.
On the first run of the partition function we make about n comparisons. In the average case the range then shrinks to about n/2 (right-left = n/2), so the next run of the partition function makes about n/2 comparisons, and so on: we make n, n/2, n/4, n/8, … comparisons.
n, n/2, n/4, n/8, … is a geometric progression with initial value n and common ratio 1/2.
SUM(n, n/2, n/4, n/8, …) = lim(t->oo) n*(1-1/2^t)/(1-1/2) = n/(1/2) = 2n.
On average we make about 2n comparisons, so the complexity is O(n).
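
The sum can be checked numerically with a tiny sketch. It models the idealized case in which every partition step discards exactly half of the remaining range, which real inputs only approximate. The function name `idealized_comparisons` is made up for this illustration.

```cpp
#include <cassert>

// Sums n + n/2 + n/4 + ... + 1 using integer halving: an idealized model in
// which each partition step keeps exactly half of the remaining range.
long long idealized_comparisons(long long n) {
    assert(n >= 1);
    long long total = 0;
    for (long long size = n; size >= 1; size /= 2) {
        total += size;
    }
    return total;
}
```

For n = 1024 this gives 1024 + 512 + … + 1 = 2047, just under 2n = 2048.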

mrigendra_7 · August 14, 2017, 12:14pm

Use std::nth_element
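
A minimal sketch of how that could look. The wrapper name `median_nth_element` and the choice of the lower middle value for even sizes are assumptions made here; `std::nth_element` itself is the standard library call from `<algorithm>` and runs in linear time on average.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Returns the median of v; for an even number of elements this is the lower
// of the two middle values. v is taken by value because nth_element reorders it.
int median_nth_element(std::vector<int> v) {
    assert(!v.empty());
    auto mid = v.begin() + (v.size() - 1) / 2;
    // After this call *mid is the element that would be at that position
    // if v were fully sorted; the rest is only partitioned around it.
    std::nth_element(v.begin(), mid, v.end());
    return *mid;
}
```

For example, `median_nth_element({6,7,8,1,2,3,4,5,9,10})` returns 5.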

cpluspluscoder · June 30, 2012, 11:37pm

@ritesh_gupta: Thanks a lot. I never thought there was a linear solution for median finding.

ritesh_gupta · June 30, 2012, 11:39pm

cpluspluscoder

betlista · July 2, 2012, 3:32pm

At first sight I cannot see that this is an O(n) algorithm, especially since it is inspired by quick sort, whose worst case is O(n^2). The “Median of Medians algorithm” is also O(n), but (if I’m not wrong here) with some big constant, O(C*n). When C is 20 or so, an O(n*log(n)) algorithm is equally good or better for n <= 1.000.000 …

cyberax · July 2, 2012, 11:45pm

For an array of integer elements, there exists a sorting algorithm that runs in O(n). Thus you can find the median by sorting the array and still stay at O(n). ( Counting sort - Wikipedia )
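
For the median alone, the counts are enough and the sorted array never has to be written out. A small sketch, assuming all values are non-negative and bounded by a known `max_value` (both the bound and the function name `median_by_counting` are assumptions of this example):

```cpp
#include <cassert>
#include <vector>

// Median via counting, assuming every value lies in [0, max_value].
// Runs in O(n + max_value) time and uses O(max_value) extra space.
// For an even number of elements it returns the lower middle value.
int median_by_counting(const std::vector<int>& v, int max_value) {
    assert(!v.empty() && max_value >= 0);
    std::vector<int> count(max_value + 1, 0);
    for (int x : v) {
        assert(x >= 0 && x <= max_value);
        ++count[x];
    }
    int target = (static_cast<int>(v.size()) + 1) / 2;  // 1-based rank of the median
    int seen = 0;
    for (int value = 0; value <= max_value; ++value) {
        seen += count[value];
        if (seen >= target) return value;  // the target rank falls on this value
    }
    return -1;  // not reached for valid input
}
```

The cost depends on `max_value` as well as on the number of elements, so this only pays off when the value range is comparable to n.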

cpluspluscoder · July 4, 2012, 11:22am

But Counting Sort is inefficient when the max element is much larger than n: its cost is O(N + max), which is already O(N^2) when max is around N^2. So I think the Selection Algorithm is the best.

panicker · July 3, 2013, 6:08pm

Why is this linear? Median of medians divides the whole array into groups of 5 elements and then insertion sorts the subarrays… these are then used… I think, ritesh, your algorithm will be n log n, or, since quicksort’s worst case is n^2, it will be O(n^2) in the worst case.

( http://cs.indstate.edu/~spitla/presentation.pdf )

herman · August 6, 2014, 6:26pm

How is the time complexity O(n)?