Median

  •  A median is described as the number separating the higher half of a sample, a population, or a probability distribution, from the lower half In probability theory and statistics.
  • The median of a finite list of numbers can be found by arranging all the numbers from lowest value to highest value and picking the middle one.

How to Find median in Arrays:

For getting the median of input array

let int Arr = { 12, 10, 18, 4, 30 }; first sort the array.

We get { 4, 10, 12, 18, 30 } after sorting.

Median is the middle element of the sorted array i.e. 12.

There are different conventions to take median of an array with even number of elements, one can take the mean of the two middle values, or first middle value, or second middle value.

[ad type=”banner”]

Let us see different methods to get the median of two sorted arrays of size n each. Since size of the set for which we are looking for median is even (2n), we take average of middle two numbers in all below solutions and return floor of the average.

Median of two sorted arrays

Method 1 (Simply count while Merging)

Use merge procedure of merge sort. Keep track of count while comparing elements of two arrays. If count becomes n(For 2n elements), we have reached the median. Take the average of the elements at indexes n-1 and n in the merged array.
See the below implementation.
Implementation:

C Programming

// A Simple Merge based O(n) solution to find median of
// two sorted arrays
#include <stdio.h>

/* This function returns median of ar1[] and ar2[].
Assumptions in this function:
Both ar1[] and ar2[] are sorted arrays
Both have n elements */
int getMedian(int ar1[], int ar2[], int n)
{
int i = 0; /* Current index of i/p array ar1[] */
int j = 0; /* Current index of i/p array ar2[] */
int count;
int m1 = -1, m2 = -1;

/* Since there are 2n elements, median will be average
of elements at index n-1 and n in the array obtained after
merging ar1 and ar2 */
for (count = 0; count <= n; count++)
{
/*Below is to handle case where all elements of ar1[] are
smaller than smallest(or first) element of ar2[]*/
if (i == n)
{
m1 = m2;
m2 = ar2[0];
break;
}

/*Below is to handle case where all elements of ar2[] are
smaller than smallest(or first) element of ar1[]*/
else if (j == n)
{
m1 = m2;
m2 = ar1[0];
break;
}

if (ar1[i] < ar2[j])
{
m1 = m2; /* Store the prev median */
m2 = ar1[i];
i++;
}
else
{
m1 = m2; /* Store the prev median */
m2 = ar2[j];
j++;
}
}

return (m1 + m2)/2;
}

/* Driver program to test above function */
int main()
{
int ar1[] = {1, 12, 15, 26, 38};
int ar2[] = {2, 13, 17, 30, 45};

int n1 = sizeof(ar1)/sizeof(ar1[0]);
int n2 = sizeof(ar2)/sizeof(ar2[0]);
if (n1 == n2)
printf("Median is %d", getMedian(ar1, ar2, n1));
else
printf("Doesn't work for arrays of unequal size");
getchar();
return 0;
}

Output

Median is 16

Time Complexity: O(n)

[ad type=”banner”]

Method 2 (By comparing the medians of two arrays)

This method works by first getting medians of the two sorted arrays and then comparing them.

Let ar1 and ar2 be the input arrays.

Algorithm:

1) Calculate the medians m1 and m2 of the input arrays ar1[] 
   and ar2[] respectively.
2) If m1 and m2 both are equal then we are done.
     return m1 (or m2)
3) If m1 is greater than m2, then median is present in one 
   of the below two subarrays.
    a)  From first element of ar1 to m1 (ar1[0...|_n/2_|])
    b)  From m2 to last element of ar2  (ar2[|_n/2_|...n-1])
4) If m2 is greater than m1, then median is present in one    
   of the below two subarrays.
   a)  From m1 to last element of ar1  (ar1[|_n/2_|...n-1])
   b)  From first element of ar2 to m2 (ar2[0...|_n/2_|])
5) Repeat the above process until size of both the subarrays 
   becomes 2.
6) If size of the two arrays is 2 then use below formula to get 
  the median.
    Median = (max(ar1[0], ar2[0]) + min(ar1[1], ar2[1]))/2

Example:

   ar1[] = {1, 12, 15, 26, 38}
   ar2[] = {2, 13, 17, 30, 45}

For above two arrays m1 = 15 and m2 = 17

[ad type=”banner”]

For the above ar1[] and ar2[], m1 is smaller than m2. So median is present in one of the following two subarrays.

   [15, 26, 38] and [2, 13, 17]

Let us repeat the process for above two subarrays:

    m1 = 26 m2 = 13.

m1 is greater than m2. So the subarrays become

  [15, 26] and [13, 17]
Now size is 2, so median = (max(ar1[0], ar2[0]) + min(ar1[1], ar2[1]))/2
                       = (max(15, 13) + min(26, 17))/2 
                       = (15 + 17)/2
                       = 16

Implementation:

// A divide and conquer based efficient solution to find median
// of two sorted arrays of same size.
#include<bits/stdc++.h>
using namespace std;

int median(int [], int); /* to get median of a sorted array */

/* This function returns median of ar1[] and ar2[].
Assumptions in this function:
Both ar1[] and ar2[] are sorted arrays
Both have n elements */
int getMedian(int ar1[], int ar2[], int n)
{
/* return -1 for invalid input */
if (n <= 0)
return -1;
if (n == 1)
return (ar1[0] + ar2[0])/2;
if (n == 2)
return (max(ar1[0], ar2[0]) + min(ar1[1], ar2[1])) / 2;

int m1 = median(ar1, n); /* get the median of the first array */
int m2 = median(ar2, n); /* get the median of the second array */

/* If medians are equal then return either m1 or m2 */
if (m1 == m2)
return m1;

/* if m1 < m2 then median must exist in ar1[m1....] and
ar2[....m2] */
if (m1 < m2)
{
if (n % 2 == 0)
return getMedian(ar1 + n/2 - 1, ar2, n - n/2 +1);
return getMedian(ar1 + n/2, ar2, n - n/2);
}

/* if m1 > m2 then median must exist in ar1[....m1] and
ar2[m2...] */
if (n % 2 == 0)
return getMedian(ar2 + n/2 - 1, ar1, n - n/2 + 1);
return getMedian(ar2 + n/2, ar1, n - n/2);
}

/* Function to get median of a sorted array */
int median(int arr[], int n)
{
if (n%2 == 0)
return (arr[n/2] + arr[n/2-1])/2;
else
return arr[n/2];
}

/* Driver program to test above function */
int main()
{
int ar1[] = {1, 2, 3, 6};
int ar2[] = {4, 6, 8, 10};
int n1 = sizeof(ar1)/sizeof(ar1[0]);
int n2 = sizeof(ar2)/sizeof(ar2[0]);
if (n1 == n2)
printf("Median is %d", getMedian(ar1, ar2, n1));
else
printf("Doesn't work for arrays of unequal size");
return 0;
}

Output :

Median is 5

Time Complexity: O(logn)

Algorithmic Paradigm: Divide and Conquer

Categorized in: