TERMIN - Editorial

PROBLEM LINK:

Practice
Contest: Division 1
Contest: Division 2
Contest: Division 3
Contest: Division 4

Author: iceknight1093
Editorialist: iceknight1093

DIFFICULTY:

Medium

PREREQUISITES:

Dynamic programming, prefix sums

PROBLEM:

For a ternary array B of length M,

  • f(B) denotes the lexicographically minimum array that can be obtained by repeating the following operation:
    • Choose two adjacent elements, delete them, and insert their sum modulo 3 at that position.
  • \text{val}(B) = \sum_{i=1}^M B_i \cdot 3^{i-1}

Given an array A, compute \sum_{i=1}^N \sum_{j=i}^N \text{val}(f(A[i\ldots j])).

EXPLANATION:

First, let’s figure out what f(B) is for a fixed array B.

Observe that with repeated uses of the operation, we can essentially reduce some subarray of B to its sum (modulo 3, of course - since everything we’re doing with the array is modulo 3, all future talks of elements of B and their sums will implicitly assume we work modulo 3).
So, any final array is equivalent to a partition of the array into subarrays; meaning we can focus on finding this partition instead.

Further, the sum of array is itself a constant - and this can be either 0, 1, or 2 - so let’s analyze what the optimal final array will be in each case.

  • \text{sum} = 0: clearly the best thing to do is just reduce the array to [0], which is the lexicographically smallest non-empty array.
  • \text{sum}=1: the optimal solution will always look like 00\ldots 001. Here, it’s ideal for there to be as many zeros as possible in the prefix, to “delay” the appearance of the first non-zero value (which is always 1 here).
  • \text{sum}=2: the optimal solution will be either 00\ldots 002 or 00\ldots 00100\ldots 001. The latter is “better” for us because 1 is smaller than 2, but is not always achievable.

Now, it appears we’re going into casework territory, especially with the \text{sum} = 2 case.
However, it turns out we don’t really need anything complicated!

There is, in fact, a relatively simple greedy algorithm that computes f(B), which is as follows:

  • Let m denote the minimum sum of a non-empty prefix of B. There are two possibilities:
    1. m does not equal \text{sum}(B). In this case, cut out the smallest prefix of B with sum m and make that one of the subarrays in the partition; then solve for the remaining array.
    2. m does equal \text{sum}(B). In this case, it’s optimal to just take the entirety of B as the subarray, and end the process immediately.

The correctness of this greedy is not too hard to prove, and can be done using a simple exchange argument.


Let’s use the knowledge of the greedy to solve the original problem.
Define dp_i to be the sum of answers of all subarrays starting at index i.
We’ll compute dp_i for i from N down to 1.

For x = 0, 1, 2, define j_x to be the smallest index such that A[i\ldots, j_x] has sum x.

Now, looking at our greedy, observe that for any R \geq j_0, the optimal first move for [i, R] is to discard the prefix [i, j_0] and solve [j_0 + 1, R] recursively.
(The one exception is if the subarray itself has a sum of 0 in which we need to take the entire thing; but the value of [0] is just 0 anyway so we don’t really need to care about this case).

This means, for all R \geq j_0, the optimal solution will look like the optimal solution for [j_0 + 1, R], but with an additional 0 at the beginning.
Inserting a zero at the beginning simply multiplies the value by 3, so summing this up across all R \geq j_0 gives us a value of 3\cdot dp_{j_0 + 1} to start with.

This leaves only i \leq R \lt j_0 as endpoints to consider.
Note that in this range, every subarray sum starting from i is either 1 or 2.
In particular, if the sum equals 1, then as per our greedy algorithm the optimal solution is to just reduce the array to [1], which has a value of 1.
So, we add c_1 to the answer, where c_1 denotes the number of i \leq R \lt j_0 such that \text{sum}(A[i\ldots R]) = 1.

Finally, we need to deal with subarrays starting at i with sum 2 (and still i \leq R \lt j_0, of course).
There are now two possibilities for each subarray: [2] or [1, 0, 0, \ldots, 0, 1].

Looking at the greedy algorithm, the first is possible only if we haven’t encountered a prefix sum of 1 yet.
That is, for all i \leq R \lt \min(j_0, j_1), the answer is [2] (which has a value of 2, and so can easily be added to dp_i).

This leaves j_1 \leq R \lt j_0 such that [i, R] has a sum of 2.
For each of these, the optimal final array will be [1, 0, 0, \ldots, 0, 1].
Observe that the value of such an array is 3^k + 1 where k is the length of the array minus 1.
Our aim is to add all these values to dp_i.

Let’s split the expression into 3^k and 1. The latter is added once for each subarray and its contribution is easily found, which leaves the former.

Observe that based on the greedy, for each prefix (starting at i) with sum 2 in this range, we want to add 3^k where k denotes the number of prefixes before it with sum 1 (after all, this is what corresponds to repeatedly choosing the smallest available prefix to us).

This can be done with the help of prefix sums.
Let’s define an auxiliary array b, where b_i = 3^k means there are k occurrences of the sum A_1 + A_2 + \ldots + A_i - 1 as a prefix before i.
Then, note that the value we’re looking for is exactly the sum of b_i in the range [j_1, j_0], though restricted to only indices with the appropriate prefix sum.
Of course, every value added this way will have “extra” powers of 3 in them, coming from indices before i: but this extra will be the same for all indices so it can just be divided out of the sum.

TIME COMPLEXITY:

\mathcal{O}(N) or \mathcal{O}(N\log{MOD}) per testcase.

CODE:

Tester's code (C++)
#include<bits/stdc++.h>
using namespace std;
using i64 = long long;
// #define int long long
#define sz(x) (int)(x).size()

template <typename T>
T inverse(T a, T m) {
    T u = 0, v = 1;
    while (a != 0) {
        T t = m / a;
        m -= t * a; swap(a, m);
        u -= t * v; swap(u, v);
    }
    assert(m == 1);
    return u;
}

template <typename T>
class Modular {
public:
    using Type = typename decay<decltype(T::value)>::type;

    constexpr Modular() : value() {}
    template <typename U>
    Modular(const U& x) {
        value = normalize(x);
    }

    template <typename U>
    static Type normalize(const U& x) {
        Type v;
        if (-mod() <= x && x < mod()) v = static_cast<Type>(x);
        else v = static_cast<Type>(x % mod());
        if (v < 0) v += mod();
        return v;
    }

    const Type& operator()() const { return value; }
    template <typename U>
    explicit operator U() const { return static_cast<U>(value); }
    constexpr static Type mod() { return T::value; }

    Modular& operator+=(const Modular& other) { if ((value += other.value) >= mod()) value -= mod(); return *this; }
    Modular& operator-=(const Modular& other) { if ((value -= other.value) < 0) value += mod(); return *this; }
    template <typename U> Modular& operator+=(const U& other) { return *this += Modular(other); }
    template <typename U> Modular& operator-=(const U& other) { return *this -= Modular(other); }
    Modular& operator++() { return *this += 1; }
    Modular& operator--() { return *this -= 1; }
    Modular operator++(int) { Modular result(*this); *this += 1; return result; }
    Modular operator--(int) { Modular result(*this); *this -= 1; return result; }
    Modular operator-() const { return Modular(-value); }

    template <typename U = T>
    typename enable_if<is_same<typename Modular<U>::Type, int>::value, Modular>::type & operator*=(const Modular& rhs) {
        value = normalize(static_cast<int64_t>(value) * static_cast<int64_t>(rhs.value));
        return *this;
    }
    template <typename U = T>
    typename enable_if<is_same<typename Modular<U>::Type, long long>::value, Modular>::type & operator*=(const Modular& rhs) {
        long long q = static_cast<long long>(static_cast<long double>(value) * rhs.value / mod());
        value = normalize(value * rhs.value - q * mod());
        return *this;
    }
    template <typename U = T>
    typename enable_if < !is_integral<typename Modular<U>::Type>::value, Modular >::type & operator*=(const Modular& rhs) {
        value = normalize(value * rhs.value);
        return *this;
    }

    Modular& operator/=(const Modular& other) { return *this *= Modular(inverse(other.value, mod())); }

    friend const Type& abs(const Modular& x) { return x.value; }

    template <typename U>
    friend bool operator==(const Modular<U>& lhs, const Modular<U>& rhs);

    template <typename U>
    friend bool operator<(const Modular<U>& lhs, const Modular<U>& rhs);

    template <typename V, typename U>
    friend V& operator>>(V& stream, Modular<U>& number);

private:
    Type value;
};

template <typename T> bool operator==(const Modular<T>& lhs, const Modular<T>& rhs) { return lhs.value == rhs.value; }
template <typename T, typename U> bool operator==(const Modular<T>& lhs, U rhs) { return lhs == Modular<T>(rhs); }
template <typename T, typename U> bool operator==(U lhs, const Modular<T>& rhs) { return Modular<T>(lhs) == rhs; }

template <typename T> bool operator!=(const Modular<T>& lhs, const Modular<T>& rhs) { return !(lhs == rhs); }
template <typename T, typename U> bool operator!=(const Modular<T>& lhs, U rhs) { return !(lhs == rhs); }
template <typename T, typename U> bool operator!=(U lhs, const Modular<T>& rhs) { return !(lhs == rhs); }

template <typename T> bool operator<(const Modular<T>& lhs, const Modular<T>& rhs) { return lhs.value < rhs.value; }

template <typename T> Modular<T> operator+(const Modular<T>& lhs, const Modular<T>& rhs) { return Modular<T>(lhs) += rhs; }
template <typename T, typename U> Modular<T> operator+(const Modular<T>& lhs, U rhs) { return Modular<T>(lhs) += rhs; }
template <typename T, typename U> Modular<T> operator+(U lhs, const Modular<T>& rhs) { return Modular<T>(lhs) += rhs; }

template <typename T> Modular<T> operator-(const Modular<T>& lhs, const Modular<T>& rhs) { return Modular<T>(lhs) -= rhs; }
template <typename T, typename U> Modular<T> operator-(const Modular<T>& lhs, U rhs) { return Modular<T>(lhs) -= rhs; }
template <typename T, typename U> Modular<T> operator-(U lhs, const Modular<T>& rhs) { return Modular<T>(lhs) -= rhs; }

template <typename T> Modular<T> operator*(const Modular<T>& lhs, const Modular<T>& rhs) { return Modular<T>(lhs) *= rhs; }
template <typename T, typename U> Modular<T> operator*(const Modular<T>& lhs, U rhs) { return Modular<T>(lhs) *= rhs; }
template <typename T, typename U> Modular<T> operator*(U lhs, const Modular<T>& rhs) { return Modular<T>(lhs) *= rhs; }

template <typename T> Modular<T> operator/(const Modular<T>& lhs, const Modular<T>& rhs) { return Modular<T>(lhs) /= rhs; }
template <typename T, typename U> Modular<T> operator/(const Modular<T>& lhs, U rhs) { return Modular<T>(lhs) /= rhs; }
template <typename T, typename U> Modular<T> operator/(U lhs, const Modular<T>& rhs) { return Modular<T>(lhs) /= rhs; }

template<typename T, typename U>
Modular<T> power(const Modular<T>& a, const U& b) {
    assert(b >= 0);
    Modular<T> x = a, res = 1;
    U p = b;
    while (p > 0) {
        if (p & 1) res *= x;
        x *= x;
        p >>= 1;
    }
    return res;
}

template <typename T>
bool IsZero(const Modular<T>& number) {
    return number() == 0;
}

template <typename T>
string to_string(const Modular<T>& number) {
    return to_string(number());
}

// U == std::ostream? but done this way because of fastoutput
template <typename U, typename T>
U& operator<<(U& stream, const Modular<T>& number) {
    return stream << number();
}

// U == std::istream? but done this way because of fastinput
template <typename U, typename T>
U& operator>>(U& stream, Modular<T>& number) {
    typename common_type<typename Modular<T>::Type, long long>::type x;
    stream >> x;
    number.value = Modular<T>::normalize(x);
    return stream;
}

/*
using ModType = int;

struct VarMod { static ModType value; };
ModType VarMod::value;
ModType& md = VarMod::value;
using Mint = Modular<VarMod>;
*/

constexpr int md = 998244353;
using Mint = Modular<std::integral_constant<decay<decltype(md)>::type, md>>;

vector<Mint> fact(1, 1);
vector<Mint> inv_fact(1, 1);

Mint C(int n, int k) {
    if (k < 0 || k > n) {
        return 0;
    }
    while ((int) fact.size() < n + 1) {
        fact.push_back(fact.back() * (int) fact.size());
        inv_fact.push_back(1 / fact.back());
    }
    return fact[n] * inv_fact[k] * inv_fact[n - k];
}

int32_t main() {
    std::ios::sync_with_stdio(false);
    std::cin.tie(0);
    int T;
    cin >> T;
    while (T--) {
        int n;
        cin >> n;
        vector<int> a(n);
        for (auto &x : a) cin >> x;
        vector<Mint> sm(3, Mint(0));
        vector<Mint> contr(n, Mint(0));
        Mint ans = 0;
        vector<Mint> spc(3, Mint(0));
        vector<Mint> add(3, Mint(0));
        // vector<Mint> two(3, Mint(0));
        Mint two = 0;
        for (int i = 0; i < n; i++) {
            vector<Mint> nsm(3, Mint(0));
            vector<Mint> nspc(3, Mint(0));
            vector<Mint> nadd(3, Mint(0));
            if (a[i] == 0) {
                nsm = sm;
                nsm[0] *= 3;//0 -> 0
                nsm[0] += 3;
                nspc = spc;
                nadd = add;
                nspc[0] *= 3;
                if (i) contr[i] = contr[i - 1]; //old contribution
                //two will be unaffected
            }
            else if (a[i] == 2) {
                nsm[0] = sm[1] * 3;
                nsm[1] = sm[2];
                nsm[2] = sm[0];
                nsm[2]++;
                contr[i] = (nsm[2]) * 2 + (nsm[1]);///00002 case
                nspc[2] = spc[0];
                nspc[1] = spc[2];
                nspc[0] = spc[1] * 3; //make 0
                nadd[2] = add[0];
                nadd[1] = add[2];
                nadd[0] = add[1];
                nspc[0] += two * 3;
                nadd[0] += two;
                two = nsm[2]; //all these end in a '2'
            }
            else if (a[i] == 1) {
                nsm[0] = sm[2] * 3;
                nsm[1] = sm[0];
                nsm[2] = sm[1];
                nsm[1]++;
                contr[i] = nsm[1];//0001 case
                contr[i] += spc[0] + add[0];
                // cout << spc[0] <<" "
                // claim: 2 can only appear as 00010001
                // okay, now how to solve this ?
                // this adds prev 1s answer
                // cout << nsm[1] << " " << spc << " " << add << '\n';
                // contr[i] += spc + add;
                // spc = nsm[1] * 3;//made a 1
                // add = nsm[1];//store 1 sum

                // we also want to add the contribution this guy brings, skipping a 1
                nspc[2] = spc[1];
                nspc[1] = spc[0];
                nspc[0] = spc[2];
                nadd[2] = add[1];
                nadd[1] = add[0];
                nadd[0] = add[2];
                nspc[0] += nsm[1] * 3;
                nadd[0] += nsm[1];
                //two gets zeroed
                two = 0;
            }
            nspc[2] = nadd[2] = 0;
            swap(nspc, spc);
            swap(nadd, add);
            swap(sm, nsm);
            ans += contr[i];
            // cout << ans << '\n';
        }
        cout << ans << '\n';
    }
}

Editorialist's code (C++)
// #include <bits/allocator.h>
// #pragma GCC optimize("O3,unroll-loops")
// #pragma GCC target("avx2,bmi,bmi2,lzcnt,popcnt")
#include "bits/stdc++.h"
using namespace std;
using ll = long long int;
mt19937_64 RNG(chrono::high_resolution_clock::now().time_since_epoch().count());
 
/**
 * Integers modulo p, where p is a prime
 * Source: Aeren (modified from tourist?)
 *         Modmul for 64-bit mod from kactl:ModMulLL
 * Works with p < 7.2e18 with x87 80-bit long double, and p < 2^52 ~ 4.5e12 with 64-bit
 */
template<typename T>
struct Z_p{
	using Type = typename decay<decltype(T::value)>::type;
	static vector<Type> MOD_INV;
	constexpr Z_p(): value(){ }
	template<typename U> Z_p(const U &x){ value = normalize(x); }
	template<typename U> static Type normalize(const U &x){
		Type v;
		if(-mod() <= x && x < mod()) v = static_cast<Type>(x);
		else v = static_cast<Type>(x % mod());
		if(v < 0) v += mod();
		return v;
	}
	const Type& operator()() const{ return value; }
	template<typename U> explicit operator U() const{ return static_cast<U>(value); }
	constexpr static Type mod(){ return T::value; }
	Z_p &operator+=(const Z_p &otr){ if((value += otr.value) >= mod()) value -= mod(); return *this; }
	Z_p &operator-=(const Z_p &otr){ if((value -= otr.value) < 0) value += mod(); return *this; }
	template<typename U> Z_p &operator+=(const U &otr){ return *this += Z_p(otr); }
	template<typename U> Z_p &operator-=(const U &otr){ return *this -= Z_p(otr); }
	Z_p &operator++(){ return *this += 1; }
	Z_p &operator--(){ return *this -= 1; }
	Z_p operator++(int){ Z_p result(*this); *this += 1; return result; }
	Z_p operator--(int){ Z_p result(*this); *this -= 1; return result; }
	Z_p operator-() const{ return Z_p(-value); }
	template<typename U = T>
	typename enable_if<is_same<typename Z_p<U>::Type, int>::value, Z_p>::type &operator*=(const Z_p& rhs){
		#ifdef _WIN32
		uint64_t x = static_cast<int64_t>(value) * static_cast<int64_t>(rhs.value);
		uint32_t xh = static_cast<uint32_t>(x >> 32), xl = static_cast<uint32_t>(x), d, m;
		asm(
			"divl %4; \n\t"
			: "=a" (d), "=d" (m)
			: "d" (xh), "a" (xl), "r" (mod())
		);
		value = m;
		#else
		value = normalize(static_cast<int64_t>(value) * static_cast<int64_t>(rhs.value));
		#endif
		return *this;
	}
	template<typename U = T>
	typename enable_if<is_same<typename Z_p<U>::Type, int64_t>::value, Z_p>::type &operator*=(const Z_p &rhs){
		uint64_t ret = static_cast<uint64_t>(value) * static_cast<uint64_t>(rhs.value) - static_cast<uint64_t>(mod()) * static_cast<uint64_t>(1.L / static_cast<uint64_t>(mod()) * static_cast<uint64_t>(value) * static_cast<uint64_t>(rhs.value));
		value = normalize(static_cast<int64_t>(ret + static_cast<uint64_t>(mod()) * (ret < 0) - static_cast<uint64_t>(mod()) * (ret >= static_cast<uint64_t>(mod()))));
		return *this;
	}
	template<typename U = T>
	typename enable_if<!is_integral<typename Z_p<U>::Type>::value, Z_p>::type &operator*=(const Z_p &rhs){
		value = normalize(value * rhs.value);
		return *this;
	}
	template<typename U>
	Z_p &operator^=(U e){
		if(e < 0) *this = 1 / *this, e = -e;
		Z_p res = 1;
		for(; e; *this *= *this, e >>= 1) if(e & 1) res *= *this;
		return *this = res;
	}
	template<typename U>
	Z_p operator^(U e) const{
		return Z_p(*this) ^= e;
	}
	Z_p &operator/=(const Z_p &otr){
		Type a = otr.value, m = mod(), u = 0, v = 1;
		if(a < (int)MOD_INV.size()) return *this *= MOD_INV[a];
		while(a){
			Type t = m / a;
			m -= t * a; swap(a, m);
			u -= t * v; swap(u, v);
		}
		assert(m == 1);
		return *this *= u;
	}
	template<typename U> friend const Z_p<U> &abs(const Z_p<U> &v){ return v; }
	Type value;
};
template<typename T> bool operator==(const Z_p<T> &lhs, const Z_p<T> &rhs){ return lhs.value == rhs.value; }
template<typename T, typename U, typename enable_if<is_integral<U>::value>::type* = nullptr> bool operator==(const Z_p<T>& lhs, U rhs){ return lhs == Z_p<T>(rhs); }
template<typename T, typename U, typename enable_if<is_integral<U>::value>::type* = nullptr> bool operator==(U lhs, const Z_p<T> &rhs){ return Z_p<T>(lhs) == rhs; }
template<typename T> bool operator!=(const Z_p<T> &lhs, const Z_p<T> &rhs){ return !(lhs == rhs); }
template<typename T, typename U, typename enable_if<is_integral<U>::value>::type* = nullptr> bool operator!=(const Z_p<T> &lhs, U rhs){ return !(lhs == rhs); }
template<typename T, typename U, typename enable_if<is_integral<U>::value>::type* = nullptr> bool operator!=(U lhs, const Z_p<T> &rhs){ return !(lhs == rhs); }
template<typename T> bool operator<(const Z_p<T> &lhs, const Z_p<T> &rhs){ return lhs.value < rhs.value; }
template<typename T> bool operator>(const Z_p<T> &lhs, const Z_p<T> &rhs){ return lhs.value > rhs.value; }
template<typename T> bool operator<=(const Z_p<T> &lhs, const Z_p<T> &rhs){ return lhs.value <= rhs.value; }
template<typename T> bool operator>=(const Z_p<T> &lhs, const Z_p<T> &rhs){ return lhs.value >= rhs.value; }
template<typename T> Z_p<T> operator+(const Z_p<T> &lhs, const Z_p<T> &rhs){ return Z_p<T>(lhs) += rhs; }
template<typename T, typename U, typename enable_if<is_integral<U>::value>::type* = nullptr> Z_p<T> operator+(const Z_p<T> &lhs, U rhs){ return Z_p<T>(lhs) += rhs; }
template<typename T, typename U, typename enable_if<is_integral<U>::value>::type* = nullptr> Z_p<T> operator+(U lhs, const Z_p<T> &rhs){ return Z_p<T>(lhs) += rhs; }
template<typename T> Z_p<T> operator-(const Z_p<T> &lhs, const Z_p<T> &rhs){ return Z_p<T>(lhs) -= rhs; }
template<typename T, typename U, typename enable_if<is_integral<U>::value>::type* = nullptr> Z_p<T> operator-(const Z_p<T>& lhs, U rhs){ return Z_p<T>(lhs) -= rhs; }
template<typename T, typename U, typename enable_if<is_integral<U>::value>::type* = nullptr> Z_p<T> operator-(U lhs, const Z_p<T> &rhs){ return Z_p<T>(lhs) -= rhs; }
template<typename T> Z_p<T> operator*(const Z_p<T> &lhs, const Z_p<T> &rhs){ return Z_p<T>(lhs) *= rhs; }
template<typename T, typename U, typename enable_if<is_integral<U>::value>::type* = nullptr> Z_p<T> operator*(const Z_p<T>& lhs, U rhs){ return Z_p<T>(lhs) *= rhs; }
template<typename T, typename U, typename enable_if<is_integral<U>::value>::type* = nullptr> Z_p<T> operator*(U lhs, const Z_p<T> &rhs){ return Z_p<T>(lhs) *= rhs; }
template<typename T> Z_p<T> operator/(const Z_p<T> &lhs, const Z_p<T> &rhs) { return Z_p<T>(lhs) /= rhs; }
template<typename T, typename U, typename enable_if<is_integral<U>::value>::type* = nullptr> Z_p<T> operator/(const Z_p<T>& lhs, U rhs) { return Z_p<T>(lhs) /= rhs; }
template<typename T, typename U, typename enable_if<is_integral<U>::value>::type* = nullptr> Z_p<T> operator/(U lhs, const Z_p<T> &rhs) { return Z_p<T>(lhs) /= rhs; }
template<typename T> istream &operator>>(istream &in, Z_p<T> &number){
	typename common_type<typename Z_p<T>::Type, int64_t>::type x;
	in >> x;
	number.value = Z_p<T>::normalize(x);
	return in;
}
template<typename T> ostream &operator<<(ostream &out, const Z_p<T> &number){ return out << number(); }
 
/*
using ModType = int;
struct VarMod{ static ModType value; };
ModType VarMod::value;
ModType &mod = VarMod::value;
using Zp = Z_p<VarMod>;
*/
 
// constexpr int mod = 1e9 + 7; // 1000000007
constexpr int mod = (119 << 23) + 1; // 998244353
// constexpr int mod = 1e9 + 9; // 1000000009
using Zp = Z_p<integral_constant<decay<decltype(mod)>::type, mod>>;
 
template<typename T> vector<typename Z_p<T>::Type> Z_p<T>::MOD_INV;
template<typename T = integral_constant<decay<decltype(mod)>::type, mod>>
void precalc_inverse(int SZ){
	auto &inv = Z_p<T>::MOD_INV;
	if(inv.empty()) inv.assign(2, 1);
	for(; inv.size() <= SZ; ) inv.push_back((mod - 1LL * mod / (int)inv.size() * inv[mod % (int)inv.size()]) % mod);
}
 
template<typename T>
vector<T> precalc_power(T base, int SZ){
	vector<T> res(SZ + 1, 1);
	for(auto i = 1; i <= SZ; ++ i) res[i] = res[i - 1] * base;
	return res;
}
 
template<typename T>
vector<T> precalc_factorial(int SZ){
	vector<T> res(SZ + 1, 1); res[0] = 1;
	for(auto i = 1; i <= SZ; ++ i) res[i] = res[i - 1] * i;
	return res;
}
 
int main()
{
    ios::sync_with_stdio(false); cin.tie(0);
 
    int t; cin >> t;
    while (t--) {
        int n; cin >> n;
        vector<int> a(n);
        for (int &x : a) cin >> x;
 
        vector<int> p(n+1);
        for (int i = 1; i <= n; ++i)
            p[i] = (p[i-1] + a[i-1]) % 3;
        array<vector<int>, 5> pos;
        for (int i = 0; i <= n; ++i)
            pos[p[i]].push_back(i);
        pos[3] = pos[0];
        pos[4] = pos[1];
 
        vector<array<Zp, 3>> powpre(n);
        vector<array<int, 3>> freqpre(n);
        {
            for (int i = 0; i < n; ++i) {
                if (i) powpre[i] = powpre[i-1], freqpre[i] = freqpre[i-1];
                powpre[i][p[i+1]] += Zp(3) ^ freqpre[i][(p[i+1]+2)%3];
                ++freqpre[i][p[i+1]];
            }
        }
 
        vector<array<int, 3>> nxt(n); // nxt[i][x] = smallest index j such that sum(a[i...j]) = x
        for (int i = n-1; i >= 0; --i) {
            nxt[i] = {n, n, n};
            if (i+1 < n) {
                for (int j = 0; j < 3; ++j) {
                    int x = (j + a[i]) % 3;
                    nxt[i][x] = nxt[i+1][j];
                }
            }
            nxt[i][a[i]] = i;
        }
 
        vector<Zp> dp(n+1);
        for (int i = n-1; i >= 0; --i) {
            auto [zero, one, two] = nxt[i];
            if (zero < n) { // Place a zero first
                dp[i] = 3*dp[zero+1];
            }
 
            
            // Now only consider [i, zero-1]
            // Just a 2: only as long as no one/two prefix is reached
            dp[i] += 2*(min(zero, one) - i);
 
            // Just a 1: everything with prefix 1
            // Taken care of later
 
            // 1000001
            if (one < zero) {
                int x = (p[i] + 2) % 3;
                Zp add = powpre[zero-1][x] - powpre[one][x];
                if (i) add /= Zp(3) ^ freqpre[i-1][(x + 2) % 3];
                dp[i] += add + zero - one;
            }
        }
 
        cout << accumulate(begin(dp), end(dp), Zp(0)) << '\n';
    }
}