INOI1502 - Editorial

ajit123q · August 29, 2020, 3:57pm

PROBLEM LINK:

Practice

Editorialist: Ajit Sharma Kasturi

DIFFICULTY:

EASY

PREREQUISITES:

Greatest Common Divisor (gcd) , Modular-Arithmetic, Dynamic-Programming (dp) , Sieve of Eratosthenes

PROBLEM:

Given a positive integer N and a number M, we need to find the number of strings of length N which are not periodic.

A string is any non-empty sequence of 0s and 1s. A string S is not periodic if there exists a string W such that S=WWW...(x\ times) i.e S = W^x for some x >= 2.

QUICK EXPLANATION:

The total number of strings of length i is 2^i.
In order to find the number of non-periodic strings of length i , we can subtract from 2^i the number of periodic strings of length i.
For every periodic string S, there always exists a string W such that S=W^x for an integer x>=2. So let’s focus on W which makes the string S.
In order to find the number of periodic strings of length i, for every j<i, we need to count how many non-periodic strings of length j exist which can make a string of length i and add them to the count of periodic strings of length i. (We are not adding periodic strings of length j as the strings of length i made by them would have been already considered by the non-periodic strings which make those periodic strings of length j).
This suggests a dp solution where dp[i] denotes the number of non-periodic strings of length i. We have dp[i]=2^i-number\_of\_periodic\_strings\_of\_length\_i. There are N states in this dp. We can precalculate 2^i(modulo \ M) for every i from 1 to N and store those values in an array.
The number of periodic strings of length i can be calculated by the method explained above and we can calculate dp[i] for every 1<=i<=N. We output dp[N] as the final answer.
Make sure to take modulo \ M at every step in the solution.

EXPLANATION:

Assumptions:

0-based indexing is used throughout the explanation.
W^x = WWW....W (x \ times).
For a string S, the function length(S) denotes the length of the string S.
For two strings S and W, if S=W^x for some x>=2 then we say S can be represented by W.
For a string S of length N, an index i is valid if \ 0<=i<N.

Let us consider a string S of length N. We say S has a period p if there exists a string W of length p such that S = W^x for some x>=2. Thus, every string S with period p has the following property: S[i]=S[i-p] for i>=p. It is easy to observe that any non-periodic string won’t have any period.

Now we have the following lemmas:

Lemma 1:

If the string S of length N has period p, then p must divisible by N.

Proof:

From the definition of period, we assume S=W^x for string W with length p and for x>=2. Comparing lengths on both sides, we get length(S)=length(W^x) . Therefore N = x*length(W)=x*p . Since x is an integer, it follows that p must be divisible by N.

Lemma 2:

If the string S of length N has periods p and q, then it also has a period of gcd(p,q).

Proof:

Without loss of generality, let us assume that p>q. If we were able to prove S has a period of p-q , then we can apply the same steps recursively and it follows directly from the Euclidean Algorithm that S has a period of gcd(p,q).

Since S has periods p and q, it follows that S[i]=S[i+p]=S[i-p] and
S[i]=S[i-q]=S[i+q].

(Ofcourse we need to consider valid indices for i,i+p,i-p, i+q, i-q ).

We need to prove S[i]=S[i-(p-q)] for all valid indices of i and i-(p-q).
If i>=p, then S[i]=S[i-p]=S[i-p+q]=S[i-(p-q)]
else if i<p \ and \ i>=p-q , then S[i]=S[i+q]=S[i+q-p]=S[i-(p-q)]

This implies that S[i]=S[i-(p-q)] for all i>=p-q.

Also, (p-q) is divisible by N, (since p and q are divisible by N).

Therefore, S has a period of p-q and it follows from the Euclidean algorithm that S also has a period of gcd(p,q).

Lemma 3:

Every periodic string S can be represented by some non-periodic string and this representation is unique.

Proof:

There always exists a non-periodic string W which satisfies the condition S=W^x for x>=2. (Think why this is true. Take the smallest period of S and find the string corresponding to this period. It must be non-periodic.)

Now, we need to prove S can be represented by a unique non-periodic string W with length p. Let us assume this is not the case and S can also be represented by another non-periodic string T with length q. Without loss of generality assume p<q. Now S has two periods p and q. By lemma 2, S also has a period of gcd(p,q). Since T is divisible by gcd(p,q) and q>gcd(p,q), T has a period of gcd(p,q) and is a periodic string. But we assumed initially that T is a non-periodic string. Thus, we arrived at a contradiction and it implies that our assumption is wrong.

Having proved the lemmas, let’s proceed to the main problem.

Firstly, we need to note that the total number of strings of length i is 2^i(the character at each position of the string can be 0 or 1).
Let us try to find the number of periodic strings of length i. By lemma 3 and lemma 1, we know that every periodic string S of length i can be represented uniquely by some non-periodic string T of length j divisible by i i.e, S= T^x where x=\frac{i}{j}.
If we get the number of periodic strings of length i as num_p, then we get the number of non-periodic strings num_{np} of length i as num_{np} = 2^i - num_p.
To find num_p we need to add for every j <i divisible by i the number of non-periodic strings of length j since they help in forming a unique periodic string of length i (lemma 3).
This suggests a dynamic programming(dp) solution. Let us consider a dp of N states where dp[i] denotes the number of non_periodic strings of length i. Here dp[i] = 2^i - num_p where num_p denotes the number of periodic strings of length i.
Thus, we get dp[i] = 2^i - \sum_{j<i \ and \ j \in divisors(i) } dp[j].
We can precalculate 2^i(modulo \ M) for every 1<=i<=N in O(N) and store it in an array, say pow2.
```
        pow2[0]=1;
        for(int i=0;i<n;i++) pow2[i]=(pow2[i-1]*2) % M; 
```

For calculating dp[i], we need to iterate over all divisors of i(excluding \ i). This can be done in O(\sqrt[]{i}) as follows:

                dp[i]=pow2[i];
                for(int j=2;j*j<i+1;j++)
                {
                             if(i%j==0) //Finding two factors j and i/j at the same time
                             {
                                        dp[i]-=dp[j]; //For the factor j
                                        if(j!=1 && i/j!=j)  // Taking care of (i/j not equal to i) and (i/j not equal to j)
                                        dp[i]-=dp[i/j];  // For the factor i/j
                             }
                }
                dp[i] = (dp[i] % M + M) % M; //Taking care of modulo M

After calculating dp[i] for all i from 1 to N, we output dp[N] as our final answer.

BASE CASE:

dp[1]=2 ( since binary strings of length 1 can’t have any period and there are two possible
strings - "0" and "1").

TIME COMPLEXITY:

Precalculating the pow2 array takes O(N) time. Calculating the dp array takes \sum_{i = 1}^{N}\normalsize\sqrt{i} which has an upper bound of \sum_{i = 1}^{N}\normalsize\sqrt{N}=N\normalsize\sqrt{N}.
Therefore, the time complexity of the above algorithm is O (N\normalsize\sqrt{N}).

SPACE COMPLEXITY:

Both pow2 array and dp array have N states. Therefore the space complexity of the above algorithm is O(N).

ALTERNATE EXPLANATION:

We can bring down the time complexity to O(N\log{N}) by precalculating proper divisors of every number using sieve of eratosthenes. The idea is to iterate over i from 1 to N and for each multiple of i(excluding \ i) <=N add i to their list of divisors.

Calculating the divisors takes time N*\sum_{i = 1}^{N}\frac{1}{i} which is O(N\log{N}). Similarly, storing the divisors takes O(N\log{N}) space.

SOLUTION:

Editorialist’s solution can be found here.

Please comment below if you have any questions, alternate solutions, or suggestions. This is my first editorial, so let me know if you like it.