EVVAL - Editorial

PROBLEM LINK:

Practice
Contest: Division 1
Contest: Division 2
Contest: Division 3
Contest: Division 4

Author: raysh07
Tester: sushil2006
Editorialist: iceknight1093

DIFFICULTY:

Easy-Medium

PREREQUISITES:

Linearity of expectation

PROBLEM:

You’re given an array A.
Consider the following random process:

  1. Choose a random integer x between 1 and N, and set L = R = x, \text{score} = A_x.
  2. Repeat N-1 times:
    • If L = 1, increase R by 1.
    • If R = N, decrease L by 1.
    • Otherwise, with equal probability, decrease L by 1 or increase R by 1.
    • Let y denote the border that changed. Add A_y\cdot (R-L+1) to \text{score}.

Find the expected final value of \text{score}.

EXPLANATION:

There’s a couple of different ways to approach this problem, I’ll present two of them.
They both use linearity of expectation, but the first one uses some symmetry arguments to avoid having to do much algebra work, while the second works purely with algebra (though it’s not really messy either).

Solution 1

One way of looking at the A_y\cdot (R-L+1) term is that A_y is added to the score once for each element that exists in the subarray when y first enters it.

That is, for a pair of indices (i, j),

  • If i is present in the subarray before j, we add A_j to the score once.
  • If j is present in the subarray before i, we add A_i to the score once instead.

Let X_{i, j} be an indicator variable denoting the event “i is already present in the subarray when j is first added to it”.
In particular, X_{i, i} = 1 for all i.

Then, the score of a process is exactly

\sum_{i=1}^N \sum_{j=1}^N X_{i, j} A_j

The expectation of this quantity, applying linearity of expectation, is

\sum_{i=1}^N \sum_{j=1}^N p(i, j) A_j

where p(i, j) denotes the probability that i is present in the subarray before j.

Let’s now try to compute p(i, j).
Suppose i \lt j. Then, conditioning on the starting index x,

  • If 1 \leq x \leq i, it’s always true that i will be present before j because of how the subarray expands.
    There are i choices here, each with a \frac{1}{N} chance of happening.
  • If j \leq x \leq N, i can never be present before j.
  • If i \lt x \lt j, it’s a bit harder to analyze what’s going on - depending on where x lies in relation to i and j, the probability that i will be reached before j will differ.

However, in the third case, note that there’s some symmetry across starting positions - specifically, we have reflection symmetry about the point i+j.

That is, for any starting position x, we look also at the starting position i+j-x.
For any sequence of moves starting from x that reaches i before j, if we start instead at i+j-x and replace left moves with right ones (and vice versa), we obtain a sequence of moves (with the same probability) that reaches j before i instead.

This establishes a probability-preserving bijection between sequences that start in the middle and reach i before j, and sequences that start in the middle but reach j before i.
Any sequence must do one of these, so their total probability is 1; meaning the probability that a sequence starts in the middle and reaches i before j is exactly \frac12.

There’s a \frac{j-i-1}{N} probability that the sequence starts in the middle in the first place, for a total probability of \frac{j-i-1}{2N} here.

So, for a fixed i \lt j, we obtain

p(i, j) = \frac{i}{N} + \frac{j-i-1}{2N}

Further, note that p(i, j) + p(j, i) = 1 for i \lt j, and p(i, i) = 1 for all i.


Now, let’s fix j and try to compute the multiplier of A_j, i.e, add up all p(i, j).

p(j, j) = 1.
For i \lt j, we have p(i, j) = \frac{i}{N} + \frac{j-i-1}{2N}.
Adding this up across all i \lt j,

  • For the first term, we only want to know the sum 1 + 2 + \ldots + (j-1), which is \frac{j\cdot (j-1)}{2}.
  • For the second term, note that it’s again just 0+1+2+\ldots + (j-2), which is \frac{(j-1)\cdot (j-2)}{2}.

j \gt i can be handled similarly, using the fact that p(i, j) = 1 - p(j, i).
So, the sum of all p(i, j) can be found in constant time; repeating this for all j gives us a linear time solution.



Solution 2

Let q(i, j) denote the probability that at some point of the process, the range we have is exactly [i, j].
Note that q(i, i) = \frac{1}{N} since we have an equal probability of starting at any point.

Now, for i \lt j, the range [i, j] is obtained by extending either [i+1, j] or [i, j-1].
Each extension has a \frac12 chance of happening if i \gt 1 and j \lt N; otherwise only one of them is possible.
So, for 1 \lt i \lt j \lt N, we obtain:

q(i, j) = \frac{1}{2}\left(q(i+1, j) + q(i, j-1)\right)

Along with q(i, i) = \frac{1}{N} for all i, it’s easy to show from here that q(i, j) = \frac{1}{N} as well; since q(i, j) = \frac{1}{2N} + \frac{1}{2N}.
This can be formally proved by inducting on the length of the range, i.e, on (j-i+1).

That leaves only the q(1, i) and the q(i, N) values.
These are similar in nature; in particular, note that

q(1, i) = q(1, i-1) + \frac12 q(2, i) = q(1, i-1) + \frac{1}{2N}

since q(2, i) = \frac{1}{N} as we know from above.

Along with q(1, 1) = \frac{1}{N}, we obtain q(1, i) = \frac{1}{N} + \frac{i-1}{2N} = \frac{i+1}{2N}.
Similarly, q(i, N) = \frac{N-i+2}{2N}

(Note that these are technically not true for q(1, N), which is 1 since it always appears in the end.)


Let’s now try to compute the answer.
Fix an index i, and let’s look at when A_i becomes the left endpoint of its segment.

That is, consider the segment [i, j] for j \geq i.
For i to become the left endpoint of this segment, the segment [i+1, j] should exist; and then extend one step to the left.

When j \lt N, the segment has a \frac{1}{N} chance of existing, and a \frac12 chance of extending right.
When j = N, the segment has a \frac{N-i+1}{2N} chance of existing, and will always propagate left.

So, fixing i and varying j \gt i, we want to compute

\sum_{j=i+1}^{N-1} \left(A_i \cdot (j-i+1) \cdot \frac{1}{2N} \right) + A_i\cdot (N-i+1)\cdot\frac{N-i+1}{2N}

\frac{A_i}{2N} is a constant here, so it can be taken out.
That leaves the summation of (j-i+1) as the only non-trivial term - however, note that this is just 2 + 3 + \ldots + (N-i) which is easy to compute in constant time.

This takes care of the case when i first becomes the left endpoint of the range; similar computations can be performed for when it’s the right endpoint by considering j \lt i.

Add this up across all i to obtain the final answer.

TIME COMPLEXITY:

\mathcal{O}(N) per testcase.

CODE:

Author's code (C++)
#include <bits/stdc++.h>
using namespace std;
#define int long long
#define INF (int)1e18
#define f first
#define s second

mt19937_64 RNG(chrono::steady_clock::now().time_since_epoch().count());
const int mod = 1e9 + 7;

long long power(int x, int y){
    if (y==0) return 1;
    
    long long v = power(x, y/2);
    v *= v; v%= mod;
    
    if (y&1) return v * x % mod;
    else return v;
}


void Solve() 
{
    int n; cin>>n;
    vector <int> a(n + 1);
    for (int i=1; i<=n; i++) cin>>a[i];
    
    int in = (mod + 1)/2;
    
    int ans = 0;
    for (int i=1; i<=n; i++){
        //consider how many times a[i] will be added 
        int v = n;
        
        //consider j < i, and you want to calculate contribution 
        v += (i * (i - 1))/2; v %= mod;
        v += ((n - i) * (n - i + 1))/2; v%= mod;
        
        //now consider all n - 1 possible js 
        //among the ones greater you get a sequence 0, 1, 2, ..., i - 2
        //among the ones smaller you get a sequence 0, 1, 2, ...
        
        int v1 = 0;
        v1 += ((i - 2) * (i - 1))/2;
        v1 += ((n - 1 - i) * (n - i))/2;
        v1 %= mod;
        
        v += v1 * in % mod; v %= mod;
        
        ans += v * a[i] % mod; ans %= mod;
    }
    ans *= power(n, mod - 2); ans %= mod;
    
    cout<<ans<<"\n";
}

int32_t main() 
{
    auto begin = std::chrono::high_resolution_clock::now();
    ios_base::sync_with_stdio(0);
    cin.tie(0);
    int t = 1;
    cin >> t;
    for(int i = 1; i <= t; i++) 
    {
        //cout << "Case #" << i << ": ";
        Solve();
    }
    auto end = std::chrono::high_resolution_clock::now();
    auto elapsed = std::chrono::duration_cast<std::chrono::nanoseconds>(end - begin);
   // cerr << "Time measured: " << elapsed.count() * 1e-9 << " seconds.\n"; 
    return 0;
}
Tester's code (C++)
#include <bits/stdc++.h>
#include <ext/pb_ds/assoc_container.hpp>
#include <ext/pb_ds/tree_policy.hpp>

using namespace std;
using namespace __gnu_pbds;

template<typename T> using Tree = tree<T, null_type, less<T>, rb_tree_tag, tree_order_statistics_node_update>;
typedef long long int ll;
typedef long double ld;
typedef pair<int,int> pii;
typedef pair<ll,ll> pll;

#define fastio ios_base::sync_with_stdio(false); cin.tie(NULL)
#define pb push_back
#define endl '\n'
#define sz(a) (int)a.size()
#define setbits(x) __builtin_popcountll(x)
#define ff first
#define ss second
#define conts continue
#define ceil2(x,y) ((x+y-1)/(y))
#define all(a) a.begin(), a.end()
#define rall(a) a.rbegin(), a.rend()
#define yes cout << "Yes" << endl
#define no cout << "No" << endl

#define rep(i,n) for(int i = 0; i < n; ++i)
#define rep1(i,n) for(int i = 1; i <= n; ++i)
#define rev(i,s,e) for(int i = s; i >= e; --i)
#define trav(i,a) for(auto &i : a)

template<typename T>
void amin(T &a, T b) {
    a = min(a,b);
}

template<typename T>
void amax(T &a, T b) {
    a = max(a,b);
}

#ifdef LOCAL
#include "debug.h"
#else
#define debug(...) 42
#endif

/*



*/

const int MOD = 1e9 + 7;
const int N = 1e5 + 5;
const int inf1 = int(1e9) + 5;
const ll inf2 = ll(1e18) + 5;

ll bexp(ll a, ll b) {
    a %= MOD;
    if (a == 0) return 0;

    ll res = 1;

    while (b) {
        if (b & 1) res = res * a % MOD;
        a = a * a % MOD;
        b >>= 1;
    }

    return res;
}

ll invmod(ll a) {
    return bexp(a, MOD - 2);
}

void solve(int test_case)
{
    ll n; cin >> n;
    vector<ll> a(n+5);
    rep1(i,n) cin >> a[i];

    auto f = [&](ll l, ll r){
        if(l > r) return 0ll;
        ll res = r*(r+1)-l*(l-1);
        return (res>>1)%MOD;
    };
    
    ll ans = 0;
    ll inv2 = invmod(2);
    ll invn = invmod(n);
    
    rep1(i,n){
        ll sum_middle_right = f(0,n-i-1);
        ll sum_middle_left = f(0,i-2);
        ll sum_side_right = f(1,n-(i+1)+1);
        ll sum_side_left = f(1,i-1);

        ll sum = inv2*(sum_middle_left+sum_middle_right)+sum_side_right+sum_side_left;
        sum %= MOD;

        /*

        rep1(j,n){
            if(i == j) conts;
            ll middle = abs(i-j)-1;
            sum += middle*inv2;
            if(i < j) sum += n-j+1;
            else sum += j;
            sum %= MOD;
        }
    
        */

        sum = sum*invn%MOD;
        ans += sum*a[i];
        ans %= MOD;
    }

    rep1(i,n){
        ans += a[i];
        ans %= MOD;
    }

    cout << ans << endl;
}

int main()
{
    fastio;

    int t = 1;
    cin >> t;

    rep1(i, t) {
        solve(i);
    }

    return 0;
}
Editorialist's code (PyPy3)
mod = 10**9 + 7

for _ in range(int(input())):
    n = int(input())
    a = list(map(int, input().split()))
    
    def sum_2n(n):
        if n <= 1: return 0
        return n*(n+1)//2 - 1
    
    inv_n = pow(n, mod-2, mod) % mod
    inv_2n = inv_n * (mod + 1)//2 % mod
    ans = sum(a) * inv_n % mod
    for i in range(n):
        val = a[i] * inv_2n % mod
        ans += val * (sum_2n(i)) % mod
        ans += val * (sum_2n(n-1-i)) % mod
        
        # prefix 
        if i > 0: ans += val * (i+1) % mod * (i+1) % mod
        if i+1 < n: ans += val * (n-i) % mod * (n-i) % mod
    print(ans % mod)
1 Like

I just love your editorials. Hats off !