GMEDIAN - Editorial(Unofficial)

utkarsh911 · November 12, 2018, 3:48pm

PROBLEM LINK:
Div1
Div2

Setter:Teja Vardhan Reddy

Editorialist: Utkarsh Pandey

DIFFICULTY:
EASY

PRE-REQUISITES:
Observation, Binomial Coefficients.

PROBLEM:
Given a sequence A1,A2,…,AN. Find the number of subsequences in which the median of the subsequence is present in that subsequence, modulo 1000000007.

Explanation
Since we have to sort each subsequence to find its median, we will sort our original sequence (i.e. A1,A2,…,AN) at first itself. The ordering won’t change.
For example, the good (desired) subsequences in the sequence [9,5,7,6,8,1] are same as that in the sequence [1,5,6,7,8,9].

The subsequences having odd length will definitely qualify to be one of our good subsequence (as in the subsequences with odd length, the element at the middle position is the median of the subsequence itself).

The subsequences with even length will qualify as good subsequence if and only if the middle two elements are same. It is so because the array is already sorted. View the content below in case you didn’t understand this statement.

Click to view

Let us suppose the middle elements found in the subsequence B1,B2,…,Bz be x and y.
Now the arithmetic mean of x and y = (x+y)/2 = (let am). am will only exist in the subsequence if there is a number greater than or equal to x or less than or equal to y. Since we have array in the sorted order, this is only possible when x=y, (i.e. both the middle elements are same)

So our answer to this problem is : (Number of subsequences having odd length) + (Number of good subsequences having even length with both the middle elements equal)

First Subtask:
It is clearly written in this subtask: A is a permutation of integers 1 through N. Also, the number of elements in the sequence A is N which means each element from 1 to N is present once and only once. This tells us that there is no repetition of any number. So, no subsequence having even length will be counted.

So, answer to this subtask = Number of subsequences having odd length
= Number of ways of selecting 1 out of n elements + Number of ways of selecting 3 out of n elements + Number of ways of selecting 5 out of n elements + … + Number of ways of selecting n or (n-1) out of n elements

=nC1 + nC3 + nC5 + ..... + nCn (if ‘n’ is odd) or nC1 + nC3 + nC5 + ..... + nC(n-1) (if ‘n’ is even)
where nCx denotes the number of ways of choosing x elements out of n elements

= 2^(n-1)
See the proof below if you didn’t understand this completely
Proof:

Click to view

(1+x)^n = nC0 + nC1*x + nC2*x^2 + nC3*x^3 + ... + nCn*x^n = (let S)
putting x = 1 and x = -1 one by one, we get
2^n = nC0 + nC1 + nC2 + nC3 + ... + nCn 0 = nC0 - nC1 + nC2 - nC3 + ... + (-1)^n * nCn
subtracting these two equations,
2^n= 2(nC1 + nC3 + ... + nCn) (if n is odd) or 2(nC1 + nC3 + ... + nC(n-1))
Therefore, nC1+nC3+nC5+.....+nCn or nC(n-1)
= 2^n / 2
= 2^(n-1)

Second and Third Subtask:
The tricky part of this question is when the elements in the sequence starts repeating. In this case, we have to find the number of subsequences having even length.
To find the subsequences of even length to be a good subsequence, we traverse through each element which is repeated at least once. For this element to be our median element in one of the subsequences of even length, we must have at least k elements at its index’s left and we must have at least k elements at the other element index’s right (Note that our array was sorted), where k is a non-negative integer.
An example is provided in the section below in case you didn’t understand the lines written above:

Click to view

Consider the sequence(A) : [1,2,2,2,2,3].
Here number of elements (N) = 6.
So, number of subsequences having odd length = 2^(N-1) = 2^5 = 32
Now, to count the number of good subsequences of even length(considering 0-based indexing):
Element at index 1 and 2 are same (2)
So, these 2 elements do form an even length good subsequence of length 2 (i.e. [2,2])
Now, number of elements at left of index 1: 1
Number of elements at right of index 2: 3
Since minimum of these two is 1, we can choose 1 element from left of index 1 and 1 element from right of index 2 to form subsequences of length 4

So, number of ways of forming good subsequences of even length: 1C1 * 3C1 = 1 * 3 = 3
which are: [1,2,2,2],[1,2,2,2] and [1,2,2,3],
Again, elements at index 1 and index 3 are same . Repeat this process until the left index reaches n-2 and right index reaches n-1
Following this process, we get number of good subsequences having even length = 22
So, 32 + 22 = 54, which will be our answer to this sequence

Now, Calculating nCx in each iteration, we may have difficulties passing the time limit provided in the problem. To reduce the time complexity to pass the given time limit, we have to observe the given fact:

Let the repeating elements be found at index i and j
Suppose, there are m elements at the left of the index i and n elements at the right of index j.

So, number of good subsequences: 1 (good subsequence of length 2) + mC1 * nC1 (good subsequences of length 4) + mC2 * nC2 (good subsequences of length 6) + … + mCmin(m,n) * nC(min(m,n)).

By Vandermonde’s identity we can sum up this value as (m+n)C(min(m,n)), details of which can be found here :Vandermonde’s Identity.

Bonus:
Calculating queries of the type nCr % 1000000007 can be solved by precalculating nCr % 1000000007 upto n = whatever you need . Details of this can be found here: nCr % p in O(1)

Time Complexity = O(N^2) (Enough to pass all the subtasks)

Editorialist’s Solution: CodeChef: Practical coding for everyone

gaurav_iiitian · November 12, 2018, 8:48pm

Vandermonde’s Identity was the crux to get AC with 100 points. I submitted many solutions but got this identity after a bit google surfing work. Such a soft dismissal. xD

cenation092 · November 12, 2018, 9:36pm

This will (Modulo Inverse For Competitive Programming | nCr % m in O( n ) | Little Fermat Theorem - YouTube ) help in calculating nCr % p.

someshsingh22 · November 13, 2018, 6:36am

I derived the vandermonde’s identity to get this done and still couldnt do it because
“”“Bonus: Calculating queries of the type nCr % 1000000007 can be solved by precalculating upto n = whatever you need . Details of this can be found here: nCr % p in O(1)
“””

ashforever007 · November 13, 2018, 10:07am

You can use DP to find number of sequences (of all lengths) which start and end at a particular element.Then finding number of sequences which use that element for even and odd length becomes easy .CodeChef: Practical coding for everyone

mightymercado · November 13, 2018, 10:39am

Testcases are weak. I solved it in O(N^4)

ducpro · November 13, 2018, 5:25pm

Lol I wrote a brute O(N ^ 3) for each test then optimized it by precalculating an array in O(1000 ^ 3), then each test only takes O(N ^ 2). Only takes 0.3 secs. (Of course, that other thing was just 1000 ^ 3 / 6, which obviously will pass)

anon55659401 · November 13, 2018, 6:02pm

This is my solution which only got 5 points,I did everything right, then why I did not get 30 points? I used the Fermat Modulo Theorem to calculate (n C r) % p
:-
https://www.codechef.com/viewsolution/21540233

savaliya_vivek · November 13, 2018, 7:36pm

My approach same as editorial , but precompte nCr using dp and i think it’s quite easy.
This is my code:
CodeChef: Practical coding for everyone

joffan · November 14, 2018, 2:38am

Using a proof of Vandermonde’s identity to illustrate how to pick all the balanced enclosing sequences around a selected middle pair:

Suppose the chosen two central elements are at positions in the sorted array with s elements before the first one and t elements after the second. Then consider any selection of the same number of elements from each side, say j elements from each side.

Now invert the selection in the “before” section, so a previously selected item becomes unselected and vice versa. This now gives s{-}j elements on that side. Combining this with the “after” section, you have a total of (s{-}j) + j = s elements from the total pool, no matter what j is.

You can easily see that every balanced selection can be turned into a corresponding choice of s elements in this way, and vice versa - every choice of s elements can be turned into a distinct balanced enclosing choice. So the number of options for any one central pair is \binom{s+t}{s}.

example:

a b c d e f g h i j k l m n

chosen central pair h & j (s=7, t=4)

a b c d e f g X - X k l m n

choose say three from each side:

. b c . e . . X - X k l m .

invert the "before"s:

a . . d . f g X - X k l m .

and we have a choice of exactly 7 elements.

And any choice of 7 elements from the 7+4=11 available can produce a unique balanced set:

a . c d e f . X - X k l . .

invert the "before"s:

. b . . . . g X - X k l . .

giving 2 each side.

If you have a run of identical elements (more than two), you can make the summation a little more efficient too.

My code.

eugalt · November 14, 2018, 6:08am

(m+n)C(min(m,n)) is the same as {m+n \choose m}={m+n \choose n} and it can be derived directly, without using Vandermonde’s identity.

We need to get the number of all combinations from m+n elements, m of which are on the left side and n - on the right side, that have equal numbers of elements on each side. Lets replace the left-side part of each combination with its complement - the set of elements on the left side that don’t belong to the original combination. If an original combination had k elements on each side the resulting combination will contain (m-k)+k=m elements. This number does not depend on k and is the same for all the resulting combinations. And if we take any combination of m elements and perform the same operation on its left side, we will obtain a combination with equal number of elements on each side. So we have a one-to-one correspondence, and the number of original combinations is equal to the number of resulting combinations, which is {m+n \choose m}.

Update: joffan essentially describes the same.

rational_kunal · November 14, 2018, 10:29am

Whats wrong with this code

for(long int i=0; i<n-1; i++){
    ans += 1;
    ans += min(right, left);
    if(a[i]==a[i+1]){
        long int right_ = right -1;
        ans += 1;
        ans += min(right_, left);
    }
    left += 1;
    right -= 1;
    ans %= MOD;
}

i am iterating in array and updating the ans. left and right are number of elements in the left and right respectively

rahul1995 · November 15, 2018, 12:17am

What’s wrong in this solution: CodeChef: Practical coding for everyone

wmoise · November 15, 2018, 4:18pm

I used another binomial identity–the hockey stick identity–for an O(n^2) solution.

https://www.codechef.com/viewsolution/21386816

Sums of binomial coefficients where n is changing have a closed form. There doesn’t exist a closed form where k is changing.

\sum_{i=0}^n \binom{i}{k} = \binom{n+1}{k+1}

\sum_{i=m}^n \binom{i}{k} = \binom{n+1}{k+1} - \binom{m}{k+1}

rahul1995 · November 15, 2018, 5:56pm

Guys my code does the exact same thing. Please someone look into this why it’s giving WA.
CodeChef: Practical coding for everyone

lllllllllllll · November 23, 2018, 4:08pm

Since we have to sort each subsequence
to find its median, we will sort our
original sequence (i.e. A1,A2,…,AN) at
first itself. The ordering won’t
change. For example, the good
(desired) subsequences in the sequence
[9,5,7,6,8,1] are same as that in the
sequence [1,5,6,7,8,9].

There is a logical error here. I assume that you mean that the number of good subsequences is independent of the ordering of the sequence. But consider [1, 2, 3, 2, 3] and its ordered counterpart [1, 2, 2, 3, 3]. In the former, [1, 2, 3, 2] and [1, 2, 2, 3] must be counted as two distinct good subsequences, while in the latter, [1, 2, 2, 3] is the only corresponding good subsequence.

I am surprised that this solution gets AC. Either the test cases were weak or I don’t understand the problem correctly.

lllllllllllll · November 23, 2018, 4:22pm

I gather that this solution counts [1, 2, 2, 3] twice for the sequence [1, 2, 2, 3, 3] (in the step where we choose 1 number form the 2 numbers to the right of 2’s, we use 2C1 which counts both the 3’s individually). But, isn’t this wrong? The problem becomes a lot more difficult if we try to take into account that many numbers can be non-distinct and then the Vandermonde’s identity expression is invalidated!

utkarsh911 · November 12, 2018, 5:49pm

@aryanc403 Done

joffan · November 13, 2018, 10:31am

If I was in hinting mood I would just say “Pascal’s triangle”.

More fully, because of the ability to derive binomial coefficients - the entries of Pascal’s Triangle - by summing two earlier values, it’s very quick to generate the values you need in this problem. And of course since we’re just adding, we can take mods as we go.

utkarsh911 · November 13, 2018, 11:58pm

The problem with your code arises in counting the number of good subsequences of even length.
One of your failed test case will be
N=6
A=1 2 2 2 2 3
Output should be 54
Your code’s output : 46
I suggest you to read the editorial completely. I’ve shown the same test case as an example to pass subtask 2