TRIPWAYS - Editorial

PROBLEM LINK:

Practice
Contest: Division 1
Contest: Division 2

Setter: David Stolp
Tester: Felipe Mota
Editorialist: Taranpreet Singh

DIFFICULTY

Super-Hard

PREREQUISITES

Generating Functions, Linear Recurrences

PROBLEM

Given a road network consisting of N cities connected by M roads where i-th city has L_i tourism spots. Starting at city 1 on day 1, each day, Chef may either

  • Move to a higher numbered city directly connected to his current city
  • Visit a tourism spot in his current city.

For some given D, he must reach city N from city 1 in exactly D days. Find the number of vacation plans possible for each queried D. There are Q such queries.

Two vacation plans (for a fixed duration of the vacation) are considered different if there is at least one day such that Chef performs different actions in the two plans on that day. Visiting different tourist attractions also counts as different actions.

QUICK EXPLANATION

EXPLANATION

This editorial details various solutions (in increasing order of complexity) to solve Linear Recurrences. Feel free to jump to various sections at your own risk.

Naive intuition

We’ll start with a basic intuition, building up sub-task wise solutions.

It’s quite tempting to start with dynamic programming, defined by the state (current city, number of days left).

DP_{u, d} denoting the number of vacation plans such that we are currently in city u after d days. As the base case, we have DP_{1, 0} = 1.

The transitions for this DP would be possible actions. Say adj_u denote the list of higher-numbered neighbors of u, we can either visit a tourism spot in the current city or move to a higher numbered neighbor.

So we have \displaystyle DP_{u, d} = L_u*DP_{u, d-1}+\sum_{v \in adj_{u}} DP_{v, d-1}. Since DP_{u, d} depend only number of ways corresponding to d-1 ways, we can compute this DP in increasing order of d, and DP_{N, D} shall store the required number of ways.

Analyzing this solution, we have N*D states and M*D transitions, this solution is intended only for sub-task 1.

Let’s see above recurrence again, we can see that it is a linear recurrence. Looking into our Linear recurrence toolbox, the first weapon we have is Matrix Exponentiation.

Matrix Exponentiation

A basic understanding of matrix exponentiation is required. Refer this ans this for detailed explanation.

Let us delve deeper into tourist attractions. For each city 1 \leq u \leq N, let’s ass L_u self loops. (Meaning L_u different edges, connecting u to itself).

Now, we can see that essentially we want the number of walks in this graph from 1 to N, taking exactly D steps. This is a well-known problem, as explained in detail here.

The idea is to maintain the number of paths from u to v for each pair (u, v) in a matrix. Matrix multiplication is defined as c_{i, k} = \sum_{j = 1}^N a_{i, j}*b_{j, k} for matrix multiplication C = A*B. Let’s assume G is a N \times N matrix where G_{u, v} denote the number of paths of length 1 (It’s easy to see it’s the adjacency matrix of given graph).

Then G*G shall denote the number of paths of length 2, G^K denoting the number of paths of length K. Hence, we just need to raise G to K-th power, G^K_{1, N} being the required number of ways.

This idea involved raising matrix to power, using Binary exponentiation, taking O(Q*N^3*log(D)) time, which is sufficient for subtask 2 only.

Enhancing Matrix Exponentiation

We are doing a lot of repetitive work here. For each query, we are computing G^p for p being powers of 2 again and again. Let’s compute these beforehand.

Now, for each query, let’s write the binary representation of D and find all set bits. Say D = 13 = (1011)_2. We need to compute G^8 * G^2 * G^1. Computing these products still takes N^3 time, we need to be smarter.

Let F be a column vector, F_i denoting the number of ways to reach city i from city 1. Initially, we have F_1 = 1 and F_i = 0 for i > 1. We can see that G*F shall also be a column vector, denoting the number of ways to reach city i from city 1 in one day. Similarly, G^K * F is also a column vector, ith entry denoting the number of ways to reach city i from city 1 in K days. Hence, we need to compute N-th value in G^8 * (G^2 * (G^1 * F)). We can perform vector-matrix multiplication in O(N^2) time, hence reducing the overall time complexity to O((N+Q)*N^2*log(D)) which shall also pass sub-task 4.

Refer Code here
import java.util.*;
import java.io.*;
import java.text.*;
class TRIPWAYS{
    //SOLUTION BEGIN
    int MOD = (int)1e9+7;
    long BIG = 8*(long)MOD*MOD;
    void pre() throws Exception{}
    void solve(int TC) throws Exception{
	int n = ni(), m = ni(), q = ni();
	int B = 60;
	int[][][] trans = new int[B][][];
	trans[0] = new int[n][n];
	for(int i = 0; i< n; i++)trans[0][i][i] = ni();
	for(int i = 0; i< m; i++)
	    trans[0][ni()-1][ni()-1] = 1;
	
	for(int b = 1; b< B; b++)
	    trans[b] = mul(trans[b-1], trans[b-1]);
	
	while(q-->0){
	    long D = nl();
	    int[] v = new int[n];
	    v[0] = 1;
	    for(int b = 0; b< B; b++)
	        if(((D>>b)&1)==1)v = mul(v, trans[b]);
	    pn(v[n-1]);
	}
    }
    int[][] mul(int[][] a, int[][] b){
	int n = a.length;
	int[][] ans = new int[n][n];
	for(int i = 0; i< n; i++){
	    long[] sum = new long[n];
	    for(int k = 0; k< n; k++){
	        for(int j = 0; j< n; j++){
	            sum[j] += (long)a[i][k]*b[k][j];
	            if(sum[j] >= BIG)sum[j] -= BIG;
	        }
	    }
	    for(int j = 0; j< n; j++)
	        ans[i][j] = (int)(sum[j]%MOD);
	}
	return ans;
    }
    int[] mul(int[] a, int[][] mat){
	int n = a.length;
	int[] c = new int[n];
	for(int i = 0; i< n; i++){
	    long sum = 0;
	    for(int j = 0; j< n; j++){
	        sum += (long)mat[j][i]*a[j];
	        if(sum >= BIG)sum -= BIG;
	    }
	    c[i] = (int)(sum%MOD);
	}
	return c;
    }
    //SOLUTION END
    void hold(boolean b)throws Exception{if(!b)throw new Exception("Hold right there, Sparky!");}
    DecimalFormat df = new DecimalFormat("0.00000000000");
    static boolean multipleTC = false;
    FastReader in;PrintWriter out;
    void run() throws Exception{
	in = new FastReader();
	out = new PrintWriter(System.out);
	//Solution Credits: Taranpreet Singh
	int T = (multipleTC)?ni():1;
	pre();for(int t = 1; t<= T; t++)solve(t);
	out.flush();
	out.close();
    }
    public static void main(String[] args) throws Exception{
	new TRIPWAYS().run();
    }
    int bit(long n){return (n==0)?0:(1+bit(n&(n-1)));}
    void p(Object o){out.print(o);}
    void pn(Object o){out.println(o);}
    void pni(Object o){out.println(o);out.flush();}
    String n()throws Exception{return in.next();}
    String nln()throws Exception{return in.nextLine();}
    int ni()throws Exception{return Integer.parseInt(in.next());}
    long nl()throws Exception{return Long.parseLong(in.next());}
    double nd()throws Exception{return Double.parseDouble(in.next());}

    class FastReader{
	BufferedReader br;
	StringTokenizer st;
	public FastReader(){
	    br = new BufferedReader(new InputStreamReader(System.in));
	}

	public FastReader(String s) throws Exception{
	    br = new BufferedReader(new FileReader(s));
	}

	String next() throws Exception{
	    while (st == null || !st.hasMoreElements()){
	        try{
	            st = new StringTokenizer(br.readLine());
	        }catch (IOException  e){
	            throw new Exception(e.toString());
	        }
	    }
	    return st.nextToken();
	}

	String nextLine() throws Exception{
	    String str = "";
	    try{   
	        str = br.readLine();
	    }catch (IOException e){
	        throw new Exception(e.toString());
	    }  
	    return str;
	}
    }
}

Characteristic Polynomials

Referring this for the understanding of characteristic polynomial, we can observe that the adjacency matrix of our graph is a triangular matrix, which results in the characteristic polynomial of our matrix being the product of diagonal entries. Effectively, the characteristic polynomial of our linear recurrence is P(x) = \displaystyle\prod_{u = 1}^N (x-L_u). For small D, we can naively compute using the solution in subtask 1 and then for larger D, use the linear recurrence \displaystyle ways_D = \sum_{i = 1}^N P_i*ways_{D-i}, giving O(N*(M+MAXD)) time complexity which is good enough for sub-task 3.

Berlekamp Massey algorithm

For those who have read this blog, you must be tempted to try applying it here. You may compute the first 2*N terms (Or maybe up to 3*N) terms using the naive solution, run Berlekamp Massey Algorithm to obtain linear recurrence (Shall be same as Characteristic Polynomial in irreducible form as Berlekamp-Massey algorithm returns minimal linear recurrence.).

The running time with this approach would be O(N*M+Q*N*log(N)*log(D)) if used FFT. It might also help to answer queries in a trie manner to save computations, as reflected in the following code.

Berlekamp Solution
#include <bits/stdc++.h>
#include <unordered_map>
using namespace std;
template<typename T = int> vector<T> create(size_t n){ return vector<T>(n); }
template<typename T, typename... Args> auto create(size_t n, Args... args){ return vector<decltype(create<T>(args...))>(n, create<T>(args...)); }
template<typename T = int, T mod = 1'000'000'007, typename U = long long>
struct umod{
    T val;
    umod(): val(0){}
    umod(U x){ x %= mod; if(x < 0) x += mod; val = x;}
    umod& operator += (umod oth){ val += oth.val; if(val >= mod) val -= mod; return *this; }
    umod& operator -= (umod oth){ val -= oth.val; if(val < 0) val += mod; return *this; }
    umod& operator *= (umod oth){ val = ((U)val) * oth.val % mod; return *this; }
    umod& operator /= (umod oth){ return *this *= oth.inverse(); }
    umod& operator ^= (U oth){ return *this = pwr(*this, oth); }
    umod operator + (umod oth) const { return umod(*this) += oth; }
    umod operator - (umod oth) const { return umod(*this) -= oth; }
    umod operator * (umod oth) const { return umod(*this) *= oth; }
    umod operator / (umod oth) const { return umod(*this) /= oth; }
    umod operator ^ (long long oth) const { return umod(*this) ^= oth; }
    bool operator < (umod oth) const { return val < oth.val; }
    bool operator > (umod oth) const { return val > oth.val; }
    bool operator <= (umod oth) const { return val <= oth.val; }
    bool operator >= (umod oth) const { return val >= oth.val; }
    bool operator == (umod oth) const { return val == oth.val; }
    bool operator != (umod oth) const { return val != oth.val; }
    umod pwr(umod a, U b) const { umod r = 1; for(; b; a *= a, b >>= 1) if(b&1) r *= a; return r; }
    umod inverse() const {
	U a = val, b = mod, u = 1, v = 0;
	while(b){
	    U t = a/b;
	    a -= t * b; swap(a, b);
	    u -= t * v; swap(u, v);
	}
	if(u < 0)
	    u += mod;
	return u;
    }
};
using U = umod<>;
struct cd {
	double x, y;
	cd(): x(0), y(0){}
	cd(const double &x, const double &y): x(x), y(y){}
	inline double real(){ return x; }
	inline double imag(){ return y; }
};
inline cd operator + (const cd &a, const cd &b){ return cd(a.x + b.x, a.y + b.y); }
inline cd operator - (const cd &a, const cd &b){ return cd(a.x - b.x, a.y - b.y); }
inline cd operator * (const cd &a, const cd &b){ return cd(a.x * b.x - a.y * b.y, a.x * b.y + a.y * b.x); }
inline cd conj(const cd &a){ return cd(a.x, -a.y); }
// reference https://github.com/Nisiyama-Suzune/TGoP/blob/master/src/number/int_fft.cpp
template <int L = 15, int MOD = 1'000'000'007>
struct int_fft {

	static const int BN = 1<<13;
	int N;
	const double PI = acos(-1);

	using complex = cd;

	int MASK; complex w[BN];
	int rev[BN];

	int_fft (int N): N(N) {
		MASK = (1 << L) - 1;
		for (int i = 0; i < N; ++i)
			w[i] = complex (cos(2 * i * PI / N), sin(2 * i * PI / N));
		for (int i = 1, j = 0; i < N - 1; ++i) {
			for (int s = N; j ^= s >>= 1, ~j & s;);
			rev[i] = j;
		}
	}

	void FFT (complex p[], int n) {
		for(int i = 1; i < n - 1; i++) if(i < rev[i]) swap(p[i], p[rev[i]]);
		for (int d = 0; (1 << d) < n; ++d) {
			int m = 1 << d, m2 = m * 2, rm = n >> (d + 1);
			for (int i = 0; i < n; i += m2) {
				for (int j = 0; j < m; ++j) {
					complex &p1 = p[i + j + m], &p2 = p[i + j];
					complex t = w[rm * j] * p1;
					p1 = p2 - t, p2 = p2 + t;
				}
			}
		}
	}

	complex A[BN], B[BN], C[BN], D[BN];

	void solve (int a[], int b[]) {
		for (int i = 0; i < N; ++i) {
			A[i] = complex (a[i] >> L, a[i] & MASK);
			B[i] = complex (b[i] >> L, b[i] & MASK);
		}
		FFT(A, N), FFT(B, N);
		for (int i = 0; i < N; ++i) {
			int j = (N - i) % N;
			complex da = (A[i] - conj(A[j])) * complex(0, -0.5),
					db = (A[i] + conj(A[j])) * complex(0.5, 0),
					dc = (B[i] - conj(B[j])) * complex(0, -0.5),
					dd = (B[i] + conj(B[j])) * complex(0.5, 0);
			C[j] = da * dd + da * dc * complex(0, 1);
			D[j] = db * dd + db * dc * complex(0, 1);
		}
		FFT(C, N), FFT(D, N);
		for (int i = 0; i < N; ++i) {
			long long da = (long long)(C[i].imag() / N + 0.5) % MOD,
					db = (long long)(C[i].real() / N + 0.5) % MOD,
					dc = (long long)(D[i].imag() / N + 0.5) % MOD,
					dd = (long long)(D[i].real() / N + 0.5) % MOD;
			a[i] = ((dd << (L * 2)) + ((db + dc) << L) + da) % MOD;
		}
	}

};
// reference - https://github.com/Nisiyama-Suzune/LMR/blob/master/src/mathematics/recurrence-relation/berlekamp-massey-algorithm.cpp
template<typename T>
struct berlekamp {
	struct poly {
		vector<T> a;
		poly(){ a.clear(); }
		poly(vector<T> & a): a(a){}
		int length() const {
			return a.size();
		}
		poly move(int d){
			vector<T> n(d, 0);
			n.insert(n.end(), a.begin(), a.end());
			return poly(n);
		}
		T calc(vector<T> &d, int pos){
			T ret = 0;
			for(int i = 0; i < (int)a.size(); i++)
				ret += d[pos - i] * a[i];
			return ret;
		}
		poly operator - (const poly &b){
			vector<T> n(max(length(), b.length()), 0);
			for(int i = 0; i < (int)n.size(); i++){
				T va = i < length() ? a[i] : 0;
				T vb = i < b.length() ? b.a[i] : 0;
				n[i] = va - vb;
			}
			return poly(n);
		}
	};
	poly sho(const T &c, const poly &p){
		vector<T> n(p.length());
		for(int i = 0; i < (int)n.size(); i++)
			n[i] = p.a[i] * c;
		return poly(n);
	}
	vector<T> solve(vector<T> a){
		int n = a.size();
		poly s, b;
		s.a.push_back(1);
		b.a.push_back(1);
		T ld = 1;
		for(int i = 0, j = -1; i < n; i++){
			T d = s.calc(a, i);
			if(d > 0){
				if((s.length() - 1) * 2 <= i){
					poly ob = b; b = s;
					s = s - sho((d / ld), ob.move(i - j));
					j = i;
					ld = d;
				} else {
					s = s - sho((d / ld), b.move(i - j));
				}
			}
		}
		return s.a;
	}
};
using poly = vector<U>;
poly prefix(poly x, int sz){
	int n = min((int)x.size(), sz);
	x.resize(n);
	return x;
}
poly reverse(poly x, int len = 0){
	if(len) x.resize(len, 0);
	reverse(x.begin(), x.end());
	return x;
}
poly operator * (poly a, int x){
	for(auto & e : a) e *= x;
	return a;
}
poly operator * (const poly &a, const poly &b){
	if(a.size() * b.size() <= 10100){
		poly res(a.size() + b.size(), 0);
		for(int i = 0; i < a.size(); i++){
			for(int j = 0; j < b.size(); j++){
				res[i + j] += a[i] * b[j];
			}
		}
		return res;
	}
	int len = 1, expec = a.size() + b.size();
	while(len < expec) len *= 2;
	int_fft<> fft(len);
	int va[len], vb[len];
	for(int i = 0; i < len; i++) if(i < a.size()) va[i] = a[i].val; else va[i] = 0;
	for(int i = 0; i < len; i++) if(i < b.size()) vb[i] = b[i].val; else vb[i] = 0;
	fft.solve(va, vb);
	poly res;
	for(int i = 0; i < a.size() + b.size(); i++) res.push_back(va[i]);
	while(res.size() && res.back() == 0) res.pop_back();
	return res;
}
poly operator - (poly a, poly b){
	poly res(max(a.size(), b.size()), 0);
	for(int i = 0; i < a.size(); i++) res[i] = a[i];
	for(int i = 0; i < b.size(); i++) res[i] -= b[i];
	while(res.size() && res.back() == 0) res.pop_back();
	return res;
}
poly inverse(poly x){
	poly ret = {x[0].inverse()};
	int n = x.size();
	for(int i = 1; i < n; i <<= 1){
		ret = prefix(ret * 2 - ret * ret * prefix(x, i << 1), i << 1);
	}
	return prefix(ret, n);
}
vector<poly> bins;
poly rec, inv, mod;
poly div(poly base){
	if(base.size() < mod.size()){
		base.clear();
		return base;
	}
	int n = base.size() - mod.size() + 1;
	return reverse(prefix((prefix(reverse(base), n) * prefix(inv, n)), n), n);
}
void build(){
	inv = inverse(reverse(mod));
	poly x = {0, 1};
	bins.push_back(x);
	for(int i = 1; i < 60; i++){
		poly c = bins[i - 1];
		c = c * c;
		c = c - div(c) * mod;
		bins.push_back(c);
	}
}
poly get(long long N){
	poly ret{1};
	for(int i = 0; i < 60; i++){
		if(N&(1ll<<i)){
			ret = ret * bins[i];
			ret = ret - div(ret) * mod;
		}
	}
	return ret;
}
const int M = 4040;
U ways[M][2 * M + 1];
vector<int> to[M];
int l[M];
const int T = 555 * 60;
int trie[T][2], nds = 1;
vector<int> qq[T];
int ans[T];
vector<poly> now;
void insert(long long val, int id){
	int cur = 0;
	for(int i = 0; i < 60; i++){
		int b = (val>>i)&1;
		if(trie[cur][b] == 0) trie[cur][b] = nds++;
		cur = trie[cur][b];
	}
	qq[cur].push_back(id);
}
void solve(int cur){
	if(now.size() == 61){
		U sum = 0;
		for(int i = 0; i < (int) now.back().size(); i++){
			sum += now.back()[i] * rec[i];
		}
		for(int id : qq[cur]) ans[id] = sum.val;
	}
	for(int i = 0; i < 2; i++){
		if(trie[cur][i]){
			if(i){
				poly p = now.back();
				p = p * bins[now.size() - 1];
				p = p - div(p) * mod;
				now.push_back(p);
			} else {
				now.push_back(now.back());
			}
			solve(trie[cur][i]);
			now.pop_back();
		}
	}
}
int main(){
	ios::sync_with_stdio(false);
	cin.tie(0);
	int n, m, q; cin >> n >> m >> q;
	for(int i = 0; i < n; i++) cin >> l[i];
	for(int i = 0; i < m; i++){
		int u, v; cin >> u >> v; u--; v--;
		to[u].push_back(v);
	}
	ways[n - 1][0] = 1;
	const int B = 2 * n;
	for(int i = n - 1; i >= 0; i--){
		for(int v : to[i]){
			for(int j = 0; j <= B; j++){
				ways[i][j + 1] += ways[v][j];
			}
		}
		for(int j = 0; j <= B; j++){
			ways[i][j + 1] += ways[i][j] * l[i];
		}
	}
	for(int j = 0; j <= B; j++) rec.push_back(ways[0][j]);
	berlekamp<U> bm;
	vector<U> trans = bm.solve(rec);
	trans = reverse(trans);
	mod = trans;
	build();
	// return 0;
	vector<long long> qs(q);
	for(int i = 0; i < q; i++){
		cin >> qs[i];
		if(qs[i] < rec.size()){
			ans[i] = rec[qs[i]].val;
			continue;
		}
		insert(qs[i], i);
	}
	now.push_back({1});
	solve(0);
	for(int i = 0; i < q; i++)
		cout << ans[i] << '\n';
	return 0;
}

Generating Functions

Let’s go back to the recurrence \displaystyle DP_{u, d} = L_u + \sum_{v \in adj_u} DP_{v, d-1}

Let’s think of this in terms of generating functions. Let P_u(x) be a polynomial such that the coefficient of x^d denotes the number of ways to reach city N from city u in exactly d days.

We can write above recurrence as \displaystyle P_u(x) = (1+L_u*x+(L_u*x)^2 \ldots) * x *\sum_{v \in adj_u} P_v(x). Secondly, the purpose of multiplying by x is that it effectively moves the coefficients of polynomial \displaystyle \sum_{v \in adj_u} P_v(x) to one degree higher, reflecting one day spent moving from city v to city u

The first term is sum of an infinite GP with common ratio L_u*x whose closed form is given as \displaystyle\frac{1}{1-L_u*x}, so our recurrence relation becomes \displaystyle P_u(x) = \frac{x}{1-L_u*x} * \sum_{v \in adj_u} P_v(x)

Our aim is to compute coefficient of x^D in P_1(x). As the base case, we have \displaystyle P_N(x) = \frac{1}{1-L_N*x}

The exact details are well explained here, the paper citing the general method here, So I’ll give you an intuition of applying binomial theorem, as mentioned in the linked post.

Let’s consider a path from city 1 to city N and e_1, e_2 \ldots e_k denote the k distinct values of L_u on path and c_i denote the number of cities with L_u = e_i Then applying binomial theorem here is equivalent to disttibuting D into atmost c_i bins which is given by \displaystyle {D+e_i-1 \choose e_i}.

The time complexity of this approach is O(M*N + Q*N*log(D)) per test case.

SOLUTIONS

Setter's Solution
#ifndef TASK
#define TASK 5
#endif

#define MOD 1000000007

#define MAXN 4000
#define MAXD 1000000000000000000LL
#define MAXL 100000
#define MAXM 100000
#define MAXQ 500

#if TASK == 1

#undef MAXD
#define MAXD 2000

#elif TASK == 2

#undef MAXN
#define MAXN 10

#elif TASK == 3

#undef MAXN
#define MAXN 1500
#undef MAXD
#define MAXD 50000

#elif TASK == 4

#undef MAXN
#define MAXN 100
#define LOGMAXD 60

#endif

#include <cstdio>

int N, M, Q;
// number of tourist attractions - these represent loops in the graph
int loop[MAXN];
int path[MAXM][2];
long long query[MAXQ];
int result[MAXQ];

void solve();
int main()
{
    scanf("%d%d%d", &N, &M, &Q);
    for (int i = 0; i < N; i++)
	scanf("%d", &loop[i]);
    for (int i = 0; i < M; i++)
    {
	scanf("%d%d", &path[i][0], &path[i][1]);
	path[i][0]--;
	path[i][1]--;
    }
    for (int i = 0; i < Q; i++)
	scanf("%lld", &query[i]);
    solve();
    for (int i = 0; i < Q; i++)
	printf("%d\n", result[i]);
}

#if TASK == 1 || TASK == 3

// Compute answers for small D using dynamic programming.
// Let dp[i][j] = number of ways to reach city i in j days
// Then dp[i][j+1] = loop[i]*dp[i][j] + sum_{roads from k to i}(dp[k][j])
//
void solve_small(int answer[], int maxd)
{
    int cur[MAXN] = {1};
    for (int D = 0; D < maxd; D++)
    {
	int next[MAXN];
	for (int i = 0; i < N; i++)
	    next[i] = 1LL*cur[i]*loop[i] % MOD;
	for (int i = 0; i < M; i++)
	{
	    next[path[i][1]] += cur[path[i][0]];
	    next[path[i][1]] %= MOD;
	}
	for (int i = 0; i < N; i++)
	    cur[i] = next[i];
	answer[D+1] = cur[N-1];
    }
}

#endif

#if TASK == 2 || TASK == 4

// standard matrix*matrix multiplication
// res = a*b
void mul(int res[MAXN][MAXN], const int a[MAXN][MAXN], const int b[MAXN][MAXN])
{
    int tmp[MAXN][MAXN] = {{}};
    for (int i = 0; i < MAXN; i++)
    for (int j = 0; j < MAXN; j++)
    for (int k = 0; k < MAXN; k++)
    {
	tmp[i][k] = (1LL*a[i][j]*b[j][k]+tmp[i][k]) % MOD;
    }
    for (int i = 0; i < MAXN; i++)
    for (int j = 0; j < MAXN; j++)
	res[i][j] = tmp[i][j];
}

// standard matrix*vector multiplication
// res = a*b
void mul(int res[MAXN], const int a[MAXN][MAXN], const int b[MAXN])
{
    int tmp[MAXN] = {};
    for (int i = 0; i < MAXN; i++)
    for (int j = 0; j < MAXN; j++)
    {
	tmp[i] = (1LL*a[i][j]*b[j]+tmp[i]) % MOD;
    }
    for (int i = 0; i < MAXN; i++)
	res[i] = tmp[i];
}

// standard matrix exponentiation
// res = base ^ expo
void pow(int res[MAXN][MAXN], const int base[MAXN][MAXN], long long expo)
{
    int sq[MAXN][MAXN];
    for (int i = 0; i < MAXN; i++)
    for (int j = 0; j < MAXN; j++)
    {
	res[i][j] = (i==j);
	sq[i][j] = base[i][j];
    }
    while (expo)
    {
	if (expo & 1)
	    mul(res, res, sq);
	mul(sq, sq, sq);
	expo >>= 1;
    }
}

#endif

// Basic dynamic-programming solution.
//
// complexity: O(MAXD * M)
#if TASK == 1

void solve()
{
    int answer[MAXD+1];
    solve_small(answer, MAXD);
    for (int i = 0; i < Q; i++)
	result[i] = answer[query[i]];
}

#endif

// Basic matrix solution
// 
// Construct a graph with N vertices, and an edge from a to b if there is a road from a to b.
// Add self-loops to each vertex with multiplicity equal to the number of tourist attractions.
// Let A be the adjacency matrix of this graph.
// Compute A^D for each query. Answer is in the corner.
//
// complexity: O(N^3 * Q * log(MAXD))
#if TASK == 2

void solve()
{
    int matrix[MAXN][MAXN] = {{}};
    for (int i = 0; i < N; i++)
	matrix[i][i] = loop[i];
    for (int i = 0; i < M; i++)
	matrix[path[i][1]][path[i][0]] = 1;
    for (int i = 0; i < Q; i++)
    {
	int res[MAXN][MAXN];
	pow(res, matrix, query[i]);
	result[i] = res[N-1][0];
    }
}

#endif

// Linear recurrence solution
//
// For small D, use dynamic programming as in task 1.
// Then use the characteristic polynomial of the adjacency matrix to find a linear recurrence.
// Then use the linear recurrence to compute answers for larger D.
//
// complexity: O(N*M + N*MAXD)
#if TASK == 3

void solve()
{

    int answer[MAXD+1] = {};
    // compute answers for D < N
    solve_small(answer, N-1);

    // compute linear recurrence via characteristic polynomial
    int poly[MAXN+1] = {MOD-1};
    for (int i = 0; i < N; i++)
    {
	for (int j = i; j >= 0; j--)
	{
	    poly[j+1] = (poly[j+1] + poly[j]) % MOD;
	    poly[j] = 1LL*poly[j]*(MOD-loop[i]) % MOD;
	}
    }

    // apply linear recurrence formula to get answers for larger D
    for (int D = N; D <= MAXD; D++)
    {
	answer[D] = 0;
	for (int i = 0; i < N; i++)
	    answer[D] = (1LL*poly[i]*answer[D-N+i] + answer[D]) % MOD;
    }

    for (int i = 0; i < Q; i++)
	result[i] = answer[query[i]];
}

#endif

// Advanced matrix solution
//
// First, compute A^2^i for i from 0 to log(MAXD)
// Let v be the vector {1 0 ... 0}
// Our goal is to compute (A^D)*v for each D.
// Write D as a sum of powers of 2, for example 13 = 1 + 4 + 8.
// We're used to computing A^13 = A^8 * A^4 * A^1, but this requires matrix*matrix multiplications
// Instead, we only need matrix*vector multiplications.
// For example, (A^13)*v can be computed as A^8 * (A^4 * (A^1 * v)), and the matrices are already computed.
//
// complexity: O(N^3 * log(MAXD) + N^2 * Q * log(MAXD))
#if TASK == 4

int matrix[LOGMAXD][MAXN][MAXN];
void solve()
{
    for (int i = 0; i < N; i++)
	matrix[0][i][i] = loop[i];
    for (int i = 0; i < M; i++)
	matrix[0][path[i][1]][path[i][0]] = 1;
    for (int i = 1; i < LOGMAXD; i++)
	mul(matrix[i], matrix[i-1], matrix[i-1]);
    for (int i = 0; i < Q; i++)
    {
	int res[MAXN] = {1};
	for (int bit = 0; bit < LOGMAXD; bit++)
	{
	    if ((query[i]>>bit) & 1)
	    {
	        mul(res, matrix[bit], res);
	    }
	}
	result[i] = res[N-1];
    }
}

#endif

// Jordan decomposition solution
//
// Since Chef only moves from lower numbered cities to higher numbered cities, the adjacency matrix is triangular.
// It follows that the eigenvalues (let's call them E_i) are the values on the diagonal, and therefore a Jordan 
//  decomposition exists.
// Denote m_i to be the multiplicity of a given eigenvalue - 0 for the first occurrence, 1 for the next, and so on.
// The existence of a Jordan decomposition implies constants C_{j,i} exist such that the 
//  number of ways to reach city j in D days is given by sum_{i from 1 to N}((D choose m_i)E_i^(D-m_i) * C_{j,i}).
// We only need to compute these constants.
//
// complexity: O(N * M + N * Q * log(MAXD))
#if TASK == 5

int add(int x, int y)
{
    return (x+y)%MOD;
}

int sub(int x, int y)
{
    return add(x, MOD-y);
}

int mul(int x, int y)
{
    return 1LL*x*y%MOD;
}

int power(int a, int b)
{
    int res = 1;
    int sq = a;
    for (; b; b >>= 1)
    {
	if (b&1) res = mul(res, sq);
	sq = mul(sq,sq);
    }
    return res;
}

int matrix[MAXN][MAXN];
int poly[MAXN][MAXN];
int inv[MAXL+1];

void solve()
{
    // pre-compute multiplicative inverses of 1 through MAXL
    for (int i = 1; i <= MAXL; i++)
    {
	inv[i] = power(i, MOD-2);
    }
    // find duplicate eigenvalues and locate previous matching eigenvalue
    int duplicate[MAXN], prev[MAXN];
    for (int i = 0; i < N; i++)
    {
	duplicate[i] = 0;
	prev[i] = -1;
	for (int j = i-1; j >= 0; j--)
	{
	    if (loop[i] == loop[j])
	    {
	        duplicate[i] = duplicate[j]+1;
	        prev[i] = j;
	        break;
	    }
	}
    }
    for (int i = 0; i < N; i++)
	matrix[i][i] = loop[i];
    for (int i = 0; i < M; i++)
	matrix[path[i][1]][path[i][0]] = 1;
    // coefficient computation, one city at a time
    poly[0][0] = 1;
    for (int i = 1; i < N; i++)
    {
	// sum the inputs into this city
	for (int j = 0; j < i; j++)
	{
	    if (matrix[i][j])
	    {
	        for (int k = 0; k <= j; k++)
	            poly[i][k] = add(poly[i][k], poly[j][k]);
	    }
	}
	// compute the coefficients for this city
	int sum = 0;
	int first = i;
	for (int j = i-1; j >= 0; j--)
	{
	    if (loop[j] == loop[i])
	    {
	        poly[i][first] = poly[i][j];
	        first = j;
	    }
	    else
	    {
	        int inverse;
	        if (loop[j] > loop[i])
	            inverse = inv[sub(loop[j], loop[i])];
	        else
	            inverse = MOD-inv[sub(loop[i], loop[j])];
	        poly[i][j] = mul(poly[i][j], inverse);
	        if (duplicate[j])
	        {
	            poly[i][prev[j]] = sub(poly[i][prev[j]], poly[i][j]);
	        }
	        else
	        {
	            sum = sub(sum, poly[i][j]);
	        }
	    }
	}
	poly[i][first] = sum;
    }
    // now answer the queries
    for (int i = 0; i < Q; i++)
    {
	int choose[MAXN];
	choose[0] = 1;
	for (int j = 1; j < N; j++)
	{
	    choose[j] = mul(choose[j-1], mul(sub((query[i]+1)%MOD, j), inv[j]));
	}
	int res = 0;
	for (int j = 0; j < N; j++)
	{
	    if (query[i] >= duplicate[j])
	    {
	        int p = mul(power(loop[j], (query[i]-duplicate[j])%(MOD-1)), choose[duplicate[j]]);
	        res = add(res, mul(poly[N-1][j], p));
	    }
	}
	result[i] = res;
    }
}

#endif
Tester's Solution (uses berlekamp massey algorithm
#include <bits/stdc++.h>
#include <unordered_map>
using namespace std;
template<typename T = int> vector<T> create(size_t n){ return vector<T>(n); }
template<typename T, typename... Args> auto create(size_t n, Args... args){ return vector<decltype(create<T>(args...))>(n, create<T>(args...)); }
template<typename T = int, T mod = 1'000'000'007, typename U = long long>
struct umod{
    T val;
    umod(): val(0){}
    umod(U x){ x %= mod; if(x < 0) x += mod; val = x;}
    umod& operator += (umod oth){ val += oth.val; if(val >= mod) val -= mod; return *this; }
    umod& operator -= (umod oth){ val -= oth.val; if(val < 0) val += mod; return *this; }
    umod& operator *= (umod oth){ val = ((U)val) * oth.val % mod; return *this; }
    umod& operator /= (umod oth){ return *this *= oth.inverse(); }
    umod& operator ^= (U oth){ return *this = pwr(*this, oth); }
    umod operator + (umod oth) const { return umod(*this) += oth; }
    umod operator - (umod oth) const { return umod(*this) -= oth; }
    umod operator * (umod oth) const { return umod(*this) *= oth; }
    umod operator / (umod oth) const { return umod(*this) /= oth; }
    umod operator ^ (long long oth) const { return umod(*this) ^= oth; }
    bool operator < (umod oth) const { return val < oth.val; }
    bool operator > (umod oth) const { return val > oth.val; }
    bool operator <= (umod oth) const { return val <= oth.val; }
    bool operator >= (umod oth) const { return val >= oth.val; }
    bool operator == (umod oth) const { return val == oth.val; }
    bool operator != (umod oth) const { return val != oth.val; }
    umod pwr(umod a, U b) const { umod r = 1; for(; b; a *= a, b >>= 1) if(b&1) r *= a; return r; }
    umod inverse() const {
	U a = val, b = mod, u = 1, v = 0;
	while(b){
	    U t = a/b;
	    a -= t * b; swap(a, b);
	    u -= t * v; swap(u, v);
	}
	if(u < 0)
	    u += mod;
	return u;
    }
};
using U = umod<>;
struct cd {
	double x, y;
	cd(): x(0), y(0){}
	cd(const double &x, const double &y): x(x), y(y){}
	inline double real(){ return x; }
	inline double imag(){ return y; }
};
inline cd operator + (const cd &a, const cd &b){ return cd(a.x + b.x, a.y + b.y); }
inline cd operator - (const cd &a, const cd &b){ return cd(a.x - b.x, a.y - b.y); }
inline cd operator * (const cd &a, const cd &b){ return cd(a.x * b.x - a.y * b.y, a.x * b.y + a.y * b.x); }
inline cd conj(const cd &a){ return cd(a.x, -a.y); }
// reference https://github.com/Nisiyama-Suzune/TGoP/blob/master/src/number/int_fft.cpp
template <int L = 15, int MOD = 1'000'000'007>
struct int_fft {

	static const int BN = 1<<13;
	int N;
	const double PI = acos(-1);

	using complex = cd;

	int MASK; complex w[BN];
	int rev[BN];

	int_fft (int N): N(N) {
		MASK = (1 << L) - 1;
		for (int i = 0; i < N; ++i)
			w[i] = complex (cos(2 * i * PI / N), sin(2 * i * PI / N));
		for (int i = 1, j = 0; i < N - 1; ++i) {
			for (int s = N; j ^= s >>= 1, ~j & s;);
			rev[i] = j;
		}
	}

	void FFT (complex p[], int n) {
		for(int i = 1; i < n - 1; i++) if(i < rev[i]) swap(p[i], p[rev[i]]);
		for (int d = 0; (1 << d) < n; ++d) {
			int m = 1 << d, m2 = m * 2, rm = n >> (d + 1);
			for (int i = 0; i < n; i += m2) {
				for (int j = 0; j < m; ++j) {
					complex &p1 = p[i + j + m], &p2 = p[i + j];
					complex t = w[rm * j] * p1;
					p1 = p2 - t, p2 = p2 + t;
				}
			}
		}
	}

	complex A[BN], B[BN], C[BN], D[BN];

	void solve (int a[], int b[]) {
		for (int i = 0; i < N; ++i) {
			A[i] = complex (a[i] >> L, a[i] & MASK);
			B[i] = complex (b[i] >> L, b[i] & MASK);
		}
		FFT(A, N), FFT(B, N);
		for (int i = 0; i < N; ++i) {
			int j = (N - i) % N;
			complex da = (A[i] - conj(A[j])) * complex(0, -0.5),
					db = (A[i] + conj(A[j])) * complex(0.5, 0),
					dc = (B[i] - conj(B[j])) * complex(0, -0.5),
					dd = (B[i] + conj(B[j])) * complex(0.5, 0);
			C[j] = da * dd + da * dc * complex(0, 1);
			D[j] = db * dd + db * dc * complex(0, 1);
		}
		FFT(C, N), FFT(D, N);
		for (int i = 0; i < N; ++i) {
			long long da = (long long)(C[i].imag() / N + 0.5) % MOD,
					db = (long long)(C[i].real() / N + 0.5) % MOD,
					dc = (long long)(D[i].imag() / N + 0.5) % MOD,
					dd = (long long)(D[i].real() / N + 0.5) % MOD;
			a[i] = ((dd << (L * 2)) + ((db + dc) << L) + da) % MOD;
		}
	}

};
// reference - https://github.com/Nisiyama-Suzune/LMR/blob/master/src/mathematics/recurrence-relation/berlekamp-massey-algorithm.cpp
template<typename T>
struct berlekamp {
	struct poly {
		vector<T> a;
		poly(){ a.clear(); }
		poly(vector<T> & a): a(a){}
		int length() const {
			return a.size();
		}
		poly move(int d){
			vector<T> n(d, 0);
			n.insert(n.end(), a.begin(), a.end());
			return poly(n);
		}
		T calc(vector<T> &d, int pos){
			T ret = 0;
			for(int i = 0; i < (int)a.size(); i++)
				ret += d[pos - i] * a[i];
			return ret;
		}
		poly operator - (const poly &b){
			vector<T> n(max(length(), b.length()), 0);
			for(int i = 0; i < (int)n.size(); i++){
				T va = i < length() ? a[i] : 0;
				T vb = i < b.length() ? b.a[i] : 0;
				n[i] = va - vb;
			}
			return poly(n);
		}
	};
	poly sho(const T &c, const poly &p){
		vector<T> n(p.length());
		for(int i = 0; i < (int)n.size(); i++)
			n[i] = p.a[i] * c;
		return poly(n);
	}
	vector<T> solve(vector<T> a){
		int n = a.size();
		poly s, b;
		s.a.push_back(1);
		b.a.push_back(1);
		T ld = 1;
		for(int i = 0, j = -1; i < n; i++){
			T d = s.calc(a, i);
			if(d > 0){
				if((s.length() - 1) * 2 <= i){
					poly ob = b; b = s;
					s = s - sho((d / ld), ob.move(i - j));
					j = i;
					ld = d;
				} else {
					s = s - sho((d / ld), b.move(i - j));
				}
			}
		}
		return s.a;
	}
};
using poly = vector<U>;
poly prefix(poly x, int sz){
	int n = min((int)x.size(), sz);
	x.resize(n);
	return x;
}
poly reverse(poly x, int len = 0){
	if(len) x.resize(len, 0);
	reverse(x.begin(), x.end());
	return x;
}
poly operator * (poly a, int x){
	for(auto & e : a) e *= x;
	return a;
}
poly operator * (const poly &a, const poly &b){
	if(a.size() * b.size() <= 10100){
		poly res(a.size() + b.size(), 0);
		for(int i = 0; i < a.size(); i++){
			for(int j = 0; j < b.size(); j++){
				res[i + j] += a[i] * b[j];
			}
		}
		return res;
	}
	int len = 1, expec = a.size() + b.size();
	while(len < expec) len *= 2;
	int_fft<> fft(len);
	int va[len], vb[len];
	for(int i = 0; i < len; i++) if(i < a.size()) va[i] = a[i].val; else va[i] = 0;
	for(int i = 0; i < len; i++) if(i < b.size()) vb[i] = b[i].val; else vb[i] = 0;
	fft.solve(va, vb);
	poly res;
	for(int i = 0; i < a.size() + b.size(); i++) res.push_back(va[i]);
	while(res.size() && res.back() == 0) res.pop_back();
	return res;
}
poly operator - (poly a, poly b){
	poly res(max(a.size(), b.size()), 0);
	for(int i = 0; i < a.size(); i++) res[i] = a[i];
	for(int i = 0; i < b.size(); i++) res[i] -= b[i];
	while(res.size() && res.back() == 0) res.pop_back();
	return res;
}
poly inverse(poly x){
	poly ret = {x[0].inverse()};
	int n = x.size();
	for(int i = 1; i < n; i <<= 1){
		ret = prefix(ret * 2 - ret * ret * prefix(x, i << 1), i << 1);
	}
	return prefix(ret, n);
}
vector<poly> bins;
poly rec, inv, mod;
poly div(poly base){
	if(base.size() < mod.size()){
		base.clear();
		return base;
	}
	int n = base.size() - mod.size() + 1;
	return reverse(prefix((prefix(reverse(base), n) * prefix(inv, n)), n), n);
}
void build(){
	inv = inverse(reverse(mod));
	poly x = {0, 1};
	bins.push_back(x);
	for(int i = 1; i < 60; i++){
		poly c = bins[i - 1];
		c = c * c;
		c = c - div(c) * mod;
		bins.push_back(c);
	}
}
poly get(long long N){
	poly ret{1};
	for(int i = 0; i < 60; i++){
		if(N&(1ll<<i)){
			ret = ret * bins[i];
			ret = ret - div(ret) * mod;
		}
	}
	return ret;
}
const int M = 4040;
U ways[M][2 * M + 1];
vector<int> to[M];
int l[M];
const int T = 555 * 60;
int trie[T][2], nds = 1;
vector<int> qq[T];
int ans[T];
vector<poly> now;
void insert(long long val, int id){
	int cur = 0;
	for(int i = 0; i < 60; i++){
		int b = (val>>i)&1;
		if(trie[cur][b] == 0) trie[cur][b] = nds++;
		cur = trie[cur][b];
	}
	qq[cur].push_back(id);
}
void solve(int cur){
	if(now.size() == 61){
		U sum = 0;
		for(int i = 0; i < (int) now.back().size(); i++){
			sum += now.back()[i] * rec[i];
		}
		for(int id : qq[cur]) ans[id] = sum.val;
	}
	for(int i = 0; i < 2; i++){
		if(trie[cur][i]){
			if(i){
				poly p = now.back();
				p = p * bins[now.size() - 1];
				p = p - div(p) * mod;
				now.push_back(p);
			} else {
				now.push_back(now.back());
			}
			solve(trie[cur][i]);
			now.pop_back();
		}
	}
}
int main(){
	ios::sync_with_stdio(false);
	cin.tie(0);
	int n, m, q; cin >> n >> m >> q;
	for(int i = 0; i < n; i++) cin >> l[i];
	for(int i = 0; i < m; i++){
		int u, v; cin >> u >> v; u--; v--;
		to[u].push_back(v);
	}
	ways[n - 1][0] = 1;
	const int B = 2 * n;
	for(int i = n - 1; i >= 0; i--){
		for(int v : to[i]){
			for(int j = 0; j <= B; j++){
				ways[i][j + 1] += ways[v][j];
			}
		}
		for(int j = 0; j <= B; j++){
			ways[i][j + 1] += ways[i][j] * l[i];
		}
	}
	for(int j = 0; j <= B; j++) rec.push_back(ways[0][j]);
	berlekamp<U> bm;
	vector<U> trans = bm.solve(rec);
	trans = reverse(trans);
	mod = trans;
	build();
	// return 0;
	vector<long long> qs(q);
	for(int i = 0; i < q; i++){
		cin >> qs[i];
		if(qs[i] < rec.size()){
			ans[i] = rec[qs[i]].val;
			continue;
		}
		insert(qs[i], i);
	}
	now.push_back({1});
	solve(0);
	for(int i = 0; i < q; i++)
		cout << ans[i] << '\n';
	return 0;
}

Feel free to share your approach. Suggestions are welcomed as always. :slight_smile:

2 Likes