TIKI - Editorial

PROBLEM LINK:

Practice
Contest: Division 1
Contest: Division 2
Contest: Division 3
Contest: Division 4

Author: kingmessi
Tester: watoac2001
Editorialist: iceknight1093

DIFFICULTY:

TBD

PREREQUISITES:

NTT

PROBLEM:

N football players stand in a circle.
The probability that a player passes the ball to someone at distance d equals \frac{W_d}{S} (distributed equally among everyone at distance d), where S = sum(W).

A Tiki Taka is a sequence of k passes such that the path traced out by the ball forms a simple k-sided polygon that doesn’t touch or contain the center of the circle.

The score of a sequence of passes is the number of Tiki Takas in it.
Find the expected score of a sequence of M passes.

EXPLANATION:

First off, we immediately apply linearity of expectation.
Let p_i be the probability that passes (i, i+1, i+2, \ldots, i+k-1) form a Tiki Taka.
Then, the answer is the sum of p_i for all i from 1 to M-k+1.

All of these p_i are clearly equal, so it’s enough to compute p_1: the probability that the first k passes form a Tiki Taka - this can then be multiplied by (M-k+1).

So, consider a sequence of k passes. When exactly will it form a simple k-sided polygon that doesn’t include the center of the circle?

Answer

Let the sequence of passes be P_1 \to P_2 \to\ldots \to P_k \to P_{k+1}.
For 1 \leq i \leq k, let d_i denote the direction in which the distance from P_i to P_{i+1} was computed (either clockwise or anticlockwise).
Then, the k-sided polygon won’t include the center if and only if exactly one of the distances was computed in a direction different from the others.

Why?

Without loss of generality, suppose P_1 \to P_2 is computed clockwise.
Consider the diameter passing through P_1.
If the sequence of passes crosses this diameter, then upon closing the polygon it’ll contain the center of the circle, which isn’t allowed.
So, the sequence of passes must all lie on one side of this diameter.

Then, for the polygon to be non-crossing, the only possible way is for each of the passes P_1\to P-2, P_2\to P_3, \ldots, P_{k-1}\to P_k to be computed clockwise, and P_k \to P_{k+1} to be computed anticlockwise.

Let’s fix P_1 = 1 for now.
The above observation then tells us that we should choose k-1 passes starting from 1 such that they end at some location not crossing the diameter passing through 1, and then go from there back to 1.

This gives us the starting point of an algorithm.
Let dp[i][j] denote the probability that the ball starts with player 1, and is with player i after exactly j clockwise passes.
Let P_d be the probability that a pass is done to a player at distance d.
Note that P_d = \frac{W_d}{2S} where S = sum(W), because there will always be exactly two players at a distance of d from any given player (the only exception is d = \frac{N}{2} when N is even, but we’ll never use such a pass anyway since it passes through the center).

Then, iterating over the last person who had the ball before i, we have

dp[i][j] = \sum_{1 \leq x \lt i} P_{i-x}\cdot dp[x][j-1]

If we know dp[i][j] for every i and j, we’d be able to solve the problem fairly easily: fix P_k = i, and look at dp[i][k-1], which is the probability that we reach i with exactly k-1 clockwise passes.
Then, we need a final anticlockwise pass to get back to 1 from here, whose probability is exactly P_{i-1}.

Sum this up across all 1 \lt i \lt \frac{N}{2} and you’ll have the probability of getting a Tiki Taka that starts and ends at 1, moving clockwise except for the final pass.
This probability should then be multiplied by 2k, to account for that fact that:

  • The direction can be either clockwise or anticlockwise.
  • The starting point can be any one of the k points of the polygon, not just 1.

Our issue now is computing dp[i][j] fast, since it’s currently \mathcal{O}(N^3).

One observation you can make is that the array dp[\cdot][j] looks like a convolution of the array dp[\cdot][j-1] and the array P.
So, computing dp[i][j] in increasing order of j, and using NTT to perform the convolution quickly, all the dp[i][j] values can be computed in \mathcal{O}(N^2 \log N) time.

But we can do even better!
Notice that we don’t really care about the intermediate dp[i][j] values: we only care about dp[i][k-1].
Further, since we’re convolving with the array P in each step, our final array just looks like P convolved with itself a bunch of times - more specifically, a power of P.

That is, if we consider the polynomial

f(x) = P_1x + P_2x^2 + \ldots + P_m x^m

where m is the maximum power we care about (i.e, m = \left\lfloor \frac{N}{2} - 1 \right\rfloor),
what we’re looking for is exactly the first m coefficients of the (k-1)-th power of this polynomial f.

Computing f^{k-1}(x) can be done in \mathcal{O}(N\log N \log K) time using binary exponentiation - the only thing you need to make sure of is that after each multiplication, you reduce f to degree m so that each convolution remains \mathcal{O}(N\log N) (we don’t care about higher powers anyway).

With this, we know all the dp[i][k-1] values, so we’re done!

TIME COMPLEXITY:

\mathcal{O}(N\log N\log K) per testcase.

CODE:

Author's code (C++)
#include<bits/stdc++.h>

#include <cassert>
#include <numeric>
#include <type_traits>

#ifdef _MSC_VER
#include <intrin.h>
#endif


#include <utility>

#ifdef _MSC_VER
#include <intrin.h>
#endif

namespace atcoder {

namespace internal {

constexpr long long safe_mod(long long x, long long m) {
    x %= m;
    if (x < 0) x += m;
    return x;
}

struct barrett {
    unsigned int _m;
    unsigned long long im;

    explicit barrett(unsigned int m) : _m(m), im((unsigned long long)(-1) / m + 1) {}

    unsigned int umod() const { return _m; }

    unsigned int mul(unsigned int a, unsigned int b) const {

        unsigned long long z = a;
        z *= b;
#ifdef _MSC_VER
        unsigned long long x;
        _umul128(z, im, &x);
#else
        unsigned long long x =
            (unsigned long long)(((unsigned __int128)(z)*im) >> 64);
#endif
        unsigned long long y = x * _m;
        return (unsigned int)(z - y + (z < y ? _m : 0));
    }
};

constexpr long long pow_mod_constexpr(long long x, long long n, int m) {
    if (m == 1) return 0;
    unsigned int _m = (unsigned int)(m);
    unsigned long long r = 1;
    unsigned long long y = safe_mod(x, m);
    while (n) {
        if (n & 1) r = (r * y) % _m;
        y = (y * y) % _m;
        n >>= 1;
    }
    return r;
}

constexpr bool is_prime_constexpr(int n) {
    if (n <= 1) return false;
    if (n == 2 || n == 7 || n == 61) return true;
    if (n % 2 == 0) return false;
    long long d = n - 1;
    while (d % 2 == 0) d /= 2;
    constexpr long long bases[3] = {2, 7, 61};
    for (long long a : bases) {
        long long t = d;
        long long y = pow_mod_constexpr(a, t, n);
        while (t != n - 1 && y != 1 && y != n - 1) {
            y = y * y % n;
            t <<= 1;
        }
        if (y != n - 1 && t % 2 == 0) {
            return false;
        }
    }
    return true;
}
template <int n> constexpr bool is_prime = is_prime_constexpr(n);

constexpr std::pair<long long, long long> inv_gcd(long long a, long long b) {
    a = safe_mod(a, b);
    if (a == 0) return {b, 0};

    long long s = b, t = a;
    long long m0 = 0, m1 = 1;

    while (t) {
        long long u = s / t;
        s -= t * u;
        m0 -= m1 * u;  // |m1 * u| <= |m1| * s <= b


        auto tmp = s;
        s = t;
        t = tmp;
        tmp = m0;
        m0 = m1;
        m1 = tmp;
    }
    if (m0 < 0) m0 += b / s;
    return {s, m0};
}

constexpr int primitive_root_constexpr(int m) {
    if (m == 2) return 1;
    if (m == 167772161) return 3;
    if (m == 469762049) return 3;
    if (m == 754974721) return 11;
    if (m == 998244353) return 3;
    int divs[20] = {};
    divs[0] = 2;
    int cnt = 1;
    int x = (m - 1) / 2;
    while (x % 2 == 0) x /= 2;
    for (int i = 3; (long long)(i)*i <= x; i += 2) {
        if (x % i == 0) {
            divs[cnt++] = i;
            while (x % i == 0) {
                x /= i;
            }
        }
    }
    if (x > 1) {
        divs[cnt++] = x;
    }
    for (int g = 2;; g++) {
        bool ok = true;
        for (int i = 0; i < cnt; i++) {
            if (pow_mod_constexpr(g, (m - 1) / divs[i], m) == 1) {
                ok = false;
                break;
            }
        }
        if (ok) return g;
    }
}
template <int m> constexpr int primitive_root = primitive_root_constexpr(m);

unsigned long long floor_sum_unsigned(unsigned long long n,
                                      unsigned long long m,
                                      unsigned long long a,
                                      unsigned long long b) {
    unsigned long long ans = 0;
    while (true) {
        if (a >= m) {
            ans += n * (n - 1) / 2 * (a / m);
            a %= m;
        }
        if (b >= m) {
            ans += n * (b / m);
            b %= m;
        }

        unsigned long long y_max = a * n + b;
        if (y_max < m) break;
        n = (unsigned long long)(y_max / m);
        b = (unsigned long long)(y_max % m);
        std::swap(m, a);
    }
    return ans;
}

}  // namespace internal

}  // namespace atcoder


#include <cassert>
#include <numeric>
#include <type_traits>

namespace atcoder {

namespace internal {

#ifndef _MSC_VER
template <class T>
using is_signed_int128 =
    typename std::conditional<std::is_same<T, __int128_t>::value ||
                                  std::is_same<T, __int128>::value,
                              std::true_type,
                              std::false_type>::type;

template <class T>
using is_unsigned_int128 =
    typename std::conditional<std::is_same<T, __uint128_t>::value ||
                                  std::is_same<T, unsigned __int128>::value,
                              std::true_type,
                              std::false_type>::type;

template <class T>
using make_unsigned_int128 =
    typename std::conditional<std::is_same<T, __int128_t>::value,
                              __uint128_t,
                              unsigned __int128>;

template <class T>
using is_integral = typename std::conditional<std::is_integral<T>::value ||
                                                  is_signed_int128<T>::value ||
                                                  is_unsigned_int128<T>::value,
                                              std::true_type,
                                              std::false_type>::type;

template <class T>
using is_signed_int = typename std::conditional<(is_integral<T>::value &&
                                                 std::is_signed<T>::value) ||
                                                    is_signed_int128<T>::value,
                                                std::true_type,
                                                std::false_type>::type;

template <class T>
using is_unsigned_int =
    typename std::conditional<(is_integral<T>::value &&
                               std::is_unsigned<T>::value) ||
                                  is_unsigned_int128<T>::value,
                              std::true_type,
                              std::false_type>::type;

template <class T>
using to_unsigned = typename std::conditional<
    is_signed_int128<T>::value,
    make_unsigned_int128<T>,
    typename std::conditional<std::is_signed<T>::value,
                              std::make_unsigned<T>,
                              std::common_type<T>>::type>::type;

#else

template <class T> using is_integral = typename std::is_integral<T>;

template <class T>
using is_signed_int =
    typename std::conditional<is_integral<T>::value && std::is_signed<T>::value,
                              std::true_type,
                              std::false_type>::type;

template <class T>
using is_unsigned_int =
    typename std::conditional<is_integral<T>::value &&
                                  std::is_unsigned<T>::value,
                              std::true_type,
                              std::false_type>::type;

template <class T>
using to_unsigned = typename std::conditional<is_signed_int<T>::value,
                                              std::make_unsigned<T>,
                                              std::common_type<T>>::type;

#endif

template <class T>
using is_signed_int_t = std::enable_if_t<is_signed_int<T>::value>;

template <class T>
using is_unsigned_int_t = std::enable_if_t<is_unsigned_int<T>::value>;

template <class T> using to_unsigned_t = typename to_unsigned<T>::type;

}  // namespace internal

}  // namespace atcoder


namespace atcoder {

namespace internal {

struct modint_base {};
struct static_modint_base : modint_base {};

template <class T> using is_modint = std::is_base_of<modint_base, T>;
template <class T> using is_modint_t = std::enable_if_t<is_modint<T>::value>;

}  // namespace internal

template <int m, std::enable_if_t<(1 <= m)>* = nullptr>
struct static_modint : internal::static_modint_base {
    using mint = static_modint;

  public:
    static constexpr int mod() { return m; }
    static mint raw(int v) {
        mint x;
        x._v = v;
        return x;
    }

    static_modint() : _v(0) {}
    template <class T, internal::is_signed_int_t<T>* = nullptr>
    static_modint(T v) {
        long long x = (long long)(v % (long long)(umod()));
        if (x < 0) x += umod();
        _v = (unsigned int)(x);
    }
    template <class T, internal::is_unsigned_int_t<T>* = nullptr>
    static_modint(T v) {
        _v = (unsigned int)(v % umod());
    }

    unsigned int val() const { return _v; }

    mint& operator++() {
        _v++;
        if (_v == umod()) _v = 0;
        return *this;
    }
    mint& operator--() {
        if (_v == 0) _v = umod();
        _v--;
        return *this;
    }
    mint operator++(int) {
        mint result = *this;
        ++*this;
        return result;
    }
    mint operator--(int) {
        mint result = *this;
        --*this;
        return result;
    }

    mint& operator+=(const mint& rhs) {
        _v += rhs._v;
        if (_v >= umod()) _v -= umod();
        return *this;
    }
    mint& operator-=(const mint& rhs) {
        _v -= rhs._v;
        if (_v >= umod()) _v += umod();
        return *this;
    }
    mint& operator*=(const mint& rhs) {
        unsigned long long z = _v;
        z *= rhs._v;
        _v = (unsigned int)(z % umod());
        return *this;
    }
    mint& operator/=(const mint& rhs) { return *this = *this * rhs.inv(); }

    mint operator+() const { return *this; }
    mint operator-() const { return mint() - *this; }

    mint pow(long long n) const {
        assert(0 <= n);
        mint x = *this, r = 1;
        while (n) {
            if (n & 1) r *= x;
            x *= x;
            n >>= 1;
        }
        return r;
    }
    mint inv() const {
        if (prime) {
            assert(_v);
            return pow(umod() - 2);
        } else {
            auto eg = internal::inv_gcd(_v, m);
            assert(eg.first == 1);
            return eg.second;
        }
    }

    friend mint operator+(const mint& lhs, const mint& rhs) {
        return mint(lhs) += rhs;
    }
    friend mint operator-(const mint& lhs, const mint& rhs) {
        return mint(lhs) -= rhs;
    }
    friend mint operator*(const mint& lhs, const mint& rhs) {
        return mint(lhs) *= rhs;
    }
    friend mint operator/(const mint& lhs, const mint& rhs) {
        return mint(lhs) /= rhs;
    }
    friend bool operator==(const mint& lhs, const mint& rhs) {
        return lhs._v == rhs._v;
    }
    friend bool operator!=(const mint& lhs, const mint& rhs) {
        return lhs._v != rhs._v;
    }

  private:
    unsigned int _v;
    static constexpr unsigned int umod() { return m; }
    static constexpr bool prime = internal::is_prime<m>;
};

template <int id> struct dynamic_modint : internal::modint_base {
    using mint = dynamic_modint;

  public:
    static int mod() { return (int)(bt.umod()); }
    static void set_mod(int m) {
        assert(1 <= m);
        bt = internal::barrett(m);
    }
    static mint raw(int v) {
        mint x;
        x._v = v;
        return x;
    }

    dynamic_modint() : _v(0) {}
    template <class T, internal::is_signed_int_t<T>* = nullptr>
    dynamic_modint(T v) {
        long long x = (long long)(v % (long long)(mod()));
        if (x < 0) x += mod();
        _v = (unsigned int)(x);
    }
    template <class T, internal::is_unsigned_int_t<T>* = nullptr>
    dynamic_modint(T v) {
        _v = (unsigned int)(v % mod());
    }

    unsigned int val() const { return _v; }

    mint& operator++() {
        _v++;
        if (_v == umod()) _v = 0;
        return *this;
    }
    mint& operator--() {
        if (_v == 0) _v = umod();
        _v--;
        return *this;
    }
    mint operator++(int) {
        mint result = *this;
        ++*this;
        return result;
    }
    mint operator--(int) {
        mint result = *this;
        --*this;
        return result;
    }

    mint& operator+=(const mint& rhs) {
        _v += rhs._v;
        if (_v >= umod()) _v -= umod();
        return *this;
    }
    mint& operator-=(const mint& rhs) {
        _v += mod() - rhs._v;
        if (_v >= umod()) _v -= umod();
        return *this;
    }
    mint& operator*=(const mint& rhs) {
        _v = bt.mul(_v, rhs._v);
        return *this;
    }
    mint& operator/=(const mint& rhs) { return *this = *this * rhs.inv(); }

    mint operator+() const { return *this; }
    mint operator-() const { return mint() - *this; }

    mint pow(long long n) const {
        assert(0 <= n);
        mint x = *this, r = 1;
        while (n) {
            if (n & 1) r *= x;
            x *= x;
            n >>= 1;
        }
        return r;
    }
    mint inv() const {
        auto eg = internal::inv_gcd(_v, mod());
        assert(eg.first == 1);
        return eg.second;
    }

    friend mint operator+(const mint& lhs, const mint& rhs) {
        return mint(lhs) += rhs;
    }
    friend mint operator-(const mint& lhs, const mint& rhs) {
        return mint(lhs) -= rhs;
    }
    friend mint operator*(const mint& lhs, const mint& rhs) {
        return mint(lhs) *= rhs;
    }
    friend mint operator/(const mint& lhs, const mint& rhs) {
        return mint(lhs) /= rhs;
    }
    friend bool operator==(const mint& lhs, const mint& rhs) {
        return lhs._v == rhs._v;
    }
    friend bool operator!=(const mint& lhs, const mint& rhs) {
        return lhs._v != rhs._v;
    }

  private:
    unsigned int _v;
    static internal::barrett bt;
    static unsigned int umod() { return bt.umod(); }
};
template <int id> internal::barrett dynamic_modint<id>::bt(998244353);

using modint998244353 = static_modint<998244353>;
using modint1000000007 = static_modint<1000000007>;
using modint = dynamic_modint<-1>;

namespace internal {

template <class T>
using is_static_modint = std::is_base_of<internal::static_modint_base, T>;

template <class T>
using is_static_modint_t = std::enable_if_t<is_static_modint<T>::value>;

template <class> struct is_dynamic_modint : public std::false_type {};
template <int id>
struct is_dynamic_modint<dynamic_modint<id>> : public std::true_type {};

template <class T>
using is_dynamic_modint_t = std::enable_if_t<is_dynamic_modint<T>::value>;

}  // namespace internal

}  // namespace atcoder

#include <ext/pb_ds/assoc_container.hpp> // Common file
#include <ext/pb_ds/tree_policy.hpp>
#define ll long long
// #define int long long
#define rep(i,a,b) for(int i=a;i<b;i++)
#define rrep(i,a,b) for(int i=a;i>=b;i--)
#define repin rep(i,0,n)
#define di(a) int a;cin>>a;
#define precise(i) cout<<fixed<<setprecision(i)
#define vi vector<int>
#define si set<int>
#define mii map<int,int>
#define take(a,n) for(int j=0;j<n;j++) cin>>a[j];
#define give(a,n) for(int j=0;j<n;j++) cout<<a[j]<<' ';
#define vpii vector<pair<int,int>>
#define sis string s;
#define sin string s;cin>>s;
#define db double
#define be(x) x.begin(),x.end()
#define pii pair<int,int>
#define pb push_back
#define pob pop_back
#define ff first
#define ss second
#define lb lower_bound
#define ub upper_bound
#define bpc(x) __builtin_popcountll(x) 
#define btz(x) __builtin_ctz(x)
using namespace std;
using namespace atcoder;
using namespace __gnu_pbds;

typedef tree<int, null_type, less<int>, rb_tree_tag,tree_order_statistics_node_update> ordered_set;
typedef tree<pair<int, int>, null_type,less<pair<int, int> >, rb_tree_tag,tree_order_statistics_node_update> ordered_multiset;

const long long INF=1e18;
const long long M=1e9+7;
const long long MM=998244353;
using mint = static_modint<MM>;
  
int power( int N, int M){
    int power = N, sum = 1;
    if(N == 0) sum = 0;
    while(M > 0){if((M & 1) == 1){sum *= power;}
    power = power * power;M = M >> 1;}
    return sum;
}

const int MOD = 998244353,rt = 5;
template<class T> void fft(vector<T>& A, bool invert = 0) { // NTT
    int n = A.size();
    // assert(((T::mod)-1)%n == 0); 
    vector<T> B(n);
    for(int b = n/2; b; b /= 2, swap(A,B)) { // w = n/b'th root
        T w = ((T)(rt)).pow(((T::mod())-1)/n*b), m = 1; 
        for(int i = 0; i < n; i += b*2, m *= w) rep(j,0,b) {
            T u = A[i+j], v = A[i+j+b]*m;
            B[i/2+j] = u+v; B[i/2+j+n/2] = u-v;
        }
    }
    if (invert) { reverse(1+be(A)); 
        T z = ((T)(n)).inv(); for(auto &t : A) t *= z; }
} // for NTT-able moduli
template<class T> vector<T> conv(vector<T> A, vector<T> B) {
    if (!min(A.size(),B.size())) return {};
    int s = A.size()+B.size()-1, n = 1; for (; n < s; n *= 2);
    A.resize(n), fft(A); B.resize(n), fft(B);
    repin A[i] *= B[i];
    fft(A,1); A.resize(s); return A;
}
template<class M, class T> vector<M> mulMod(const vector<T>& x, const vector<T>& y) {
    auto con = [](const vector<T>& v) {
        vector<M> w(v.size()); rep(i,0,v.size()) w[i] = (int)v[i].val();
        return w; };
    return conv(con(x), con(y));
} // arbitrary moduli
template<class T> vector<T> conv_general(const vector<T>& A, const vector<T>& B) {
    using m0 = static_modint<(119<<23)+1>; auto c0 = mulMod<m0>(A,B);
    using m1 = static_modint<(5<<25)+1>; auto c1 = mulMod<m1>(A,B);
    using m2 = static_modint<(7<<26)+1>; auto c2 = mulMod<m2>(A,B);
    int n = c0.size(); vector<T> res(n); m1 r01 = ((m1)(m0::mod())).inv(); 
    m2 r02 = ((m2)(m0::mod())).inv();m2 r12 = ((m2)(m1::mod())).inv();
    repin { // a=remainder mod m0::mod, b fixes it mod m1::mod
        int a = c0[i].val(), b = ((c1[i]-a)*r01).val(), 
            c = (((c2[i]-a)*r02-b)*r12).val();
        res[i] = (T(c)*(m1::mod())+b)*(m0::mod())+a; // c fixes m2::mod
    }
    return res;
}

vector<mint> V[20];

 
void solve()
{
	int n,m,p;
	cin >> n >> m >> p;
	// cin >> n;
	rep(i,0,31)V[i].clear();
	vi a(n/2);
	take(a,n/2);
	mint sm = 0;
	for(auto x : a)sm += x;
	if(m < p){
		cout << "0\n";return;
	}
	if(p-1 > n/2){
		cout << "0\n";return;
	}
	vector<mint> b(n);
	b[0] = 0;
	if(n%2){
		rep(i,1,n/2+1){
			b[i] = a[i-1];
			b[i] /= 2;
		}
		rep(i,n/2+1,n){
			b[i] = a[n/2-1-(i-n/2-1)];
			b[i] /= 2;
		}
	}
	else{
		rep(i,1,n/2){
			b[i] = a[i-1];
			b[i] /= 2;
		}
		b[n/2] = a[n/2-1];
		rep(i,n/2+1,n){
			assert(n/2-2-(i-n/2-1) >= 0);
			b[i] = a[n/2-2-(i-n/2-1)];
			b[i] /= 2;
		}
	}
	// give(b,n);cout << "\n";
	// for(auto x : b){
	// 	cout << x.val() << " ";
	// }cout << "\n";
	assert(sm != 0);
	repin{
		b[i] /= sm;
	}
	int sz = (n+1)/2;
	vector<mint> c(sz);
	rep(i,0,sz)c[i] = b[i];
	V[0] = c;
	rep(i,1,21){
		if((1<<i) > p)break;
		vector<mint> temp = conv(V[i-1],V[i-1]);
		while(temp.size() != sz)temp.pob();
		V[i] = temp;
	}
	// rep(i,0,30){
	// 	cout << i << " : ";
	// 	for(auto x : V[i]){
	// 		cout << x.val() << " ";
	// 	}cout << "\n";
	// }
	vector<mint> ans(sz,0);
	ans[0] = 1;
	p--;
	rep(i,0,21){
		if(p&(1<<i)){
			vector<mint> res = conv(V[i],ans);
			while(res.size() != sz)res.pob();
			ans = res;
		}
	}
	// for(auto x : ans){
	// 	cout << x.val() << " ";
	// }cout << "\n";
	mint fin = 0;
	rep(i,0,sz){
		fin += ans[i]*b[i];
	}
	p++;
	// cout << m << "\n";
	// cout << p << "\n";
	// fin *= n;
	fin *= p; // may start from anywhere
	fin *= 2; // may be clockwise anticlockwise
	fin *= (m-p+1); // linear EV
	cout << fin.val() << "\n";

	
	// mint res = a[0];
	// res /= 2;
	// res = res.pow(5);
	// res *= a[4];
	// res /= 2;
	// mint res1 = sm;
	// res1 = res1.pow(6);
	// res1 = res1.pow(MM-2);
	// res *= res1;
	// res *= (m-p+1);
	// res *= 12;
	// cout << res.val() << "\n";
}

signed main(){
    ios_base::sync_with_stdio(0);
    cin.tie(0);
    cout.tie(0);
    #ifdef NCR
        init();
    #endif
    #ifdef SIEVE
        sieve();
    #endif
        int t; cin >> t; while(t--)
        solve();
    return 0;
}
Tester's code (C++)
#include<bits/stdc++.h>

using namespace std;

typedef long long ll;
typedef long double ld;
typedef vector<ll> vl;
typedef vector<int> vi;

#define rep(i, a, b) for(int i = a; i < b; i++)
#define all(x) begin(x), end(x)
#define sz(x) static_cast<int>((x).size())
//#define int long long

const ll mod = (119 << 23) + 1, root = 62; // = 998244353
// For p < 2^30 there is also e.g. 5 << 25, 7 << 26, 479 << 21
// and 483 << 21 (same root). The last two are > 10^9.
ll modpow (ll b, ll e) {
        ll ans = 1;
        for (; e; b = b * b % mod, e /= 2)
                if (e & 1) ans = ans * b % mod;
        return ans;
}
void ntt (vl &a) {
        int n = sz(a), L = 31 - __builtin_clz(n);
        static vl rt(2, 1);
        for (static int k = 2, s = 2; k < n; k *= 2, s++) {
                rt.resize(n);
                ll z[] = {1, modpow(root, mod >> s)};
                rep(i,k,2*k) rt[i] = rt[i / 2] * z[i & 1] % mod;
        }
        vi rev(n);
        rep(i,0,n) rev[i] = (rev[i / 2] | (i & 1) << L) / 2;
        rep(i,0,n) if (i < rev[i]) swap(a[i], a[rev[i]]);
        for (int k = 1; k < n; k *= 2)
                for (int i = 0; i < n; i += 2 * k) rep(j,0,k) {
                        ll z = rt[j + k] * a[i + j + k] % mod, &ai = a[i + j];
                        a[i + j + k] = ai - z + (z > ai ? mod : 0);
                        ai += (ai + z >= mod ? z - mod : z);
                }
}
vl conv (const vl &a, const vl &b) {
        if (a.empty() || b.empty()) return {};
        int s = sz(a) + sz(b) - 1, B = 32 - __builtin_clz(s), n = 1 << B;
        int inv = modpow(n, mod - 2);
        vl L(a), R(b), out(n);
        L.resize(n), R.resize(n);
        ntt(L), ntt(R);
        rep(i,0,n) out[-i & (n - 1)] = (ll)L[i] * R[i] % mod * inv % mod;
        ntt(out);
        return {out.begin(), out.begin() + s};
}

ll power (ll x, int a) {
        ll y = 1;
        while (a) {
                if (a & 1) y = y * x % mod;
                x = x * x % mod;
                a >>= 1;
        }
        return y;
}

vl power (vl x, int a, int n) {
        vl y = {1};
        while (a) {
                if (a & 1) {
                        y = conv(y, x);
                        while (sz(y) > n) y.pop_back();
                }
                x = conv(x, x);
                while (sz(x) > n) x.pop_back();
                a >>= 1;
        }
        return y;
}

signed main() {

        ios::sync_with_stdio(0);
        cin.tie(0);

        int t;
        cin >> t;

        while (t--) {

                int n, m, p;
                cin >> n >> m >> p;
                int a[n / 2 + 1] = {0};
                for (int i = 1; i <= n / 2; i++) cin >> a[i];
                vl b((n + 1) / 2, 0ll);
                ll sm = accumulate(a, a + n / 2 + 1, 0ll);
                sm %= mod;

                ll inv = power(sm, mod - 2);
                ll i2 = power(2, mod - 2);
                for (int i = 1; i < (n + 1) / 2; i++) {
                        b[i] = a[i] * inv % mod * i2 % mod;
                        if (n % 2 == 0 && i == n / 2) {
                                b[i] = a[i] * inv % mod;
                        }
                }

                vl c = power(b, p - 1, n);
                vl f(n, 0);

                for (int i = n / 2 + 1; i < n; i++) f[i] = a[n - i] * inv % mod * i2 % mod;

                vl res = conv(f, c);

                if (sz(res) <= n || m < p) {
                        cout << 0 << "\n";
                        continue;
                }
                ll ans = res[n] * 2 * p % mod * (m - p + 1) % mod;
                cout << ans % mod << "\n";
                

        }

        
}