Contest Division 1
Contest Division 2
Contest Division 3

Author: Utkarsh Gupta
Tester: Tejas Pandey
Editorialist: Daanish Mahajan




Expectation, Probability, Binomial Theorem, Geometric Series Sum


Chef is playing a game, which he starts with a score of S = 0. He also has an integer N.

In one move, Chef does the following:

  • Uniformly randomly pick an integer X between 0 and 2^N - 1 (inclusive of both ends)
  • Update his score as S \rightarrow S \mid X, where \mid denotes the bitwise OR operation

Chef stops when his score S becomes equal to 2^N-1. What is the expected number of moves Chef performs? Output the answer modulo 10^9 + 7.


Subtask 1 Solution

Define S_i the state in which S has exactly i bits set.
Let E[X_i] denote the expected number of moves to get N bits set if at present exactly i bits are set and P_{i, j} denote the probability of moving from state S_i \rightarrow S_j (where j \ge i since the number of set bits won’t reduce with OR operation).

So we can write:
E[X_i] = 1 + \sum_{j = i}^{N} E[X_j] \cdot P_{i, j}
where E[X_N] = 0 and P_{i, j} = \frac{\binom{N - i}{j - i}}{2 ^ {N - i}}

P_{i, j} explanation

If we have i bits set currently, then N - i bits are unset. Out of these N - i bits we want exactly j - i more bits to set, so we choose the same in \binom{N - i}{j - i} and divide the expression with total possibilities for the N - i unset bits to get the required probability.

Finally, our answer is E[0].

The complexity for the same is \mathcal{O}(N ^ 2) but requires you to precompute factorials, the inverse of factorials, and the inverse of powers of 2.

If you try optimizing the \mathcal{O}(N^2) solution, you will find that there’s no way to do so if we follow the previous approach. Here’s when basics come into the picture.

Main Solution

Answer = \sum_{i = 1}^{\infty} i \cdot P_i where P_i denotes probability that the process ends in exactly i steps.

The equation can be rewritten as: Answer
= (P_1 + P_2 + P_3 + \ldots + P_{\infty}) + (P_2 + P_3 + \ldots + P_{\infty}) + (P_3 + \ldots + P_{\infty}) + \ldots

= \sum_{i = 1}^{\infty} Q_i (where Q_i denotes the probability that the process needs at least i steps before finish)

Q_i explanation

Q_i = 1 - R_{i - 1} (where R_{i - 1} denotes the probability that the process needs atmost i - 1 steps before finish)
Now important observation here to calculate R_{i - 1} is to consider each bit independently. Therefore:
R_{i - 1} = B^{N} where B denotes the probability that the process needs atmost i - 1 steps to set one bit.
So, B = 1 - probability that the bit remains unset after i - 1 steps = 1 - \frac{1}{2^{i - 1}}.

= \sum_{i = 1}^{\infty} 1 - (1 - \frac{1}{2^{i - 1}})^N

= \sum_{i = 1}^{\infty} 1 - (\sum_{j = 0}^{N} \binom{N}{j} (- \frac{1}{2^{i - 1}})^j)

= \sum_{i = 1}^{\infty} \sum_{j = 1}^{N} (-1)^{j + 1} \binom{N}{j} (\frac{1}{2^{i - 1}})^j

= \sum_{j = 1}^{N} (-1)^{j + 1} \binom{N}{j} \sum_{i = 1}^{\infty} (\frac{1}{2^j})^{i - 1}

= \sum_{j = 1}^{N} (-1)^{j + 1} \binom{N}{j} \frac{1}{(1 - \frac{1}{2^j})} (Sum of infinite geometric sequence with first term as a and common difference as r (\lt 1) is \frac{a}{1 - r})

= \sum_{j = 1}^{N} (-1)^{j + 1} \binom{N}{j} \frac{2^j}{(2^j - 1)}

The above summation can be calculated in \mathcal{O}(N) or \mathcal{O}(N \log_2 N).


Setter's Solution
#include <bits/stdc++.h>
#define ll long long int
#define mod 1000000007
using namespace std;
ll power(ll a,ll b) {ll res=1;a%=mod; assert(b>=0); for(;b;b>>=1){if(b&1)res=res*a%mod;a=a*a%mod;}return res;}
ll modInverse(ll a){return power(a,mod-2);}
const int N=1000023;
ll fact[N];
ll invfact[N];
ll inv[N];
void factorialsComputation()
    for(int i=2;i<N;i++)
ll ncr(ll n,ll r)
    ll ans=fact[n]*invfact[r];
    return ans;
void solve()
    ll n;
    ll ans=1;
    for(int i=1;i<=n;i++)
        ll temp = ncr(n,i)*modInverse(power(2,i)-1);
int main()
    int t;
Tester's Solution
#include <bits/stdc++.h>
#define ll long long int
#define mod 1000000007
#define maxn 300007
using namespace std;

ll fact[maxn],ifact[maxn];

ll mpow(ll a, ll b){
	ll res = 1;
		if(b&1) res *= a,res %= mod;
		a *= a;
		a %= mod;
		b >>= 1;
	return res;

void pre(){
	fact[0]  = fact[1] = ifact[0]  = ifact[1] = 1;
	for(int i = 2;i < maxn;i++){
		fact[i] = (fact[i - 1]*i)%mod;
		ifact[i] = mpow(fact[i],mod - 2);

ll comb(ll a,ll b){
	ll ans = (fact[a]*ifact[b])%mod;
	ans *= ifact[a - b];
	ans %= mod;
	return ans;

int main() {
	int t;
	cin >> t;
	while(t--) {
		int n;
		cin >> n;
		ll ans = 1, sign = 1;
    	for(int i = 1; i <= n; i++) {
        	ans += (comb(n, i)*mpow((mpow(2, i) - 1 + mod)%mod, mod - 2))%mod*sign;
        	ans += mod;
        	ans %= mod;
        	sign *= -1;
    	cout << ans << "\n";
	return 0;
1 Like

1+p+p^2…infinity is not equal to 1/(1-p). it approaches it.

if a->b then a%mod not necessarily approach b%mod;

(1+p+p^2…inf)%mod = (1%mod + p%mod + p^2 %mod… )%mod which is NOT EQUAL modinv(1-p)

e.g let a = 1/2
a^inf → 0. as per given solution (a^inf)%mod=0
but (a^inf)%mod = ((a%mod)^inf)%mod = (modInv(2)^inf)%mod whis is NOT equal to zero

We cannot use infinite gp sum without mod in one place and nCr % mod afterwards. Either take everything in mod or nothing.

Author/tester Kindly comment.

1 Like

ersshiva Really good query. From what I think the mod should not be applied (even though it doesn’t break any general rules of modulus ) in GP because of uncertainity we have on what is infinity. I mean while general understanding is sum of positive number is going to positive but we got Ramanujan series which is negative so what is to say (modInv(2)^inf)%mod is not zero instead of large number. In the problem statement though, author must have not considered mod at all but rather just figure out the probability formulae and then applied mod which works fine. But again good query to think about so waiting on comments from author and other intelligent folks.
P.S: These are simply my thoughts and do not claim to have consulted any specific paper for the same so feel free to contradict.

It’s possible to calculate all values for all 1 \leq n \leq N in O(N log ^2 N).

Let’s just look at the formula given in Subtask 1 soln.
Instead of defining E_i as expected moves required when we have i set bits.
Let’s define E_i as expected moves required when we have i unset bits left.
The formula would be -
E_i = 1 + \sum_{j=0}^i E_j{i \choose j} / 2^{i}

We can rearrange this equation to -

E_i * 2^{i}/ i! = 2^{i}/i! + \sum_{j=0}^{i-1} E_j / j! / (j-i)! + Ei/i!
\implies E_i * (2^{i} -1)/ i! = 2^{i}/i! + \sum_{j=0}^{i-1} E_j / j! / (j-i)!
\implies (E_i / i!) = 2^{i}/i!/(2^{i} -1) + \sum_{j=0}^{i-1} 1 / (j-i)! * (E_j / j!)

All one has to do now is to just plug it into an Online FFT solver to find all the values of this recurrence in O(N*log ^2 N).

One can take a look at this submission. It does this in O(N log ^2 N)

I didn’t try optimising it further because TL wasn’t too tight for @neal 's pretty fast FFT implementation for mod 10^9+7. FFT template I had in my library was 4x slower compared to his library.
But I know for the fact that online fft problems can in fact be solved much faster in O(N logN) using Operations on polynomials and series and stuff.
So yeah one should be able to compute all values of E[X_i] in O(N logN) for 1\leq i \leq N

(Not author/tester)

Consider this:
lim_{x \rightarrow inf} (1 + p + p^2 + ... p^x)\%mod .
I think it is pretty obvious that if you write the partial sum series here, it doesn’t converge to any particular value and also doesn’t blow up to inf, so the limit doesn’t actually exist. Thus when you say:

(1+p+p^2…inf)%mod = (1%mod + p%mod + p^2 %mod… )%mod.

the lhs in particular doesn’t really make much sense.
The solution does not suffer from any stuff like this, because it first derives the answer as a finite series without involving any mods, then simply evaluates this finite series in the mod world. The usual rules (a +b)\%mod = (a\%mod + b\%mod)\%mod (and the same for multiplication) hold perfectly well for a finite number of terms, so there shoudn’t be any problem.

Just read the problem statement again. It’s pretty clearly formally and explicitly says what you should output.

The answer would be a fraction of form P/Q, where gcd(Q, 10^9+7) = 1 and then find an integer which we denote by Q^{-1} such that Q * Q^{-1} \% mod = 1, then just print P * Q^{-1} \% mod as an answer.

Ohh. This small detail clears everything. You are right! My bad on going with limit stuff. We don’t need to take mod right away but at end just to express answer.
We don’t need to find E % m but P.inv(q) %m. both are a bit different. But still a learning thing to consider for future.

1 Like