PROBLEM LINK:
Editorialist: Ajit Sharma Kasturi
DIFFICULTY:
EASY
PREREQUISITES:
Greatest Common Divisor (gcd) , Modular-Arithmetic, Dynamic-Programming (dp) , Sieve of Eratosthenes
PROBLEM:
Given a positive integer N and a number M, we need to find the number of strings of length N which are not periodic.
A string is any non-empty sequence of 0s and 1s. A string S is not periodic if there exists a string W such that S=WWW...(x\ times) i.e S = W^x for some x >= 2.
QUICK EXPLANATION:
-
The total number of strings of length i is 2^i.
-
In order to find the number of non-periodic strings of length i , we can subtract from 2^i the number of periodic strings of length i.
-
For every periodic string S, there always exists a string W such that S=W^x for an integer x>=2. So let’s focus on W which makes the string S.
-
In order to find the number of periodic strings of length i, for every j<i, we need to count how many non-periodic strings of length j exist which can make a string of length i and add them to the count of periodic strings of length i.
(We are not adding periodic strings of length j as the strings of length i made by them would have been already considered by the non-periodic strings which make those periodic strings of length j).
-
This suggests a dp solution where dp[i] denotes the number of non-periodic strings of length i. We have dp[i]=2^i-number\_of\_periodic\_strings\_of\_length\_i. There are N states in this dp. We can precalculate 2^i(modulo \ M) for every i from 1 to N and store those values in an array.
-
The number of periodic strings of length i can be calculated by the method explained above and we can calculate dp[i] for every 1<=i<=N. We output dp[N] as the final answer.
-
Make sure to take modulo \ M at every step in the solution.
EXPLANATION:
Assumptions:
- 0-based indexing is used throughout the explanation.
- W^x = WWW....W (x \ times).
- For a string S, the function length(S) denotes the length of the string S.
- For two strings S and W, if S=W^x for some x>=2 then we say S can be represented by W.
- For a string S of length N, an index i is valid if \ 0<=i<N.
Let us consider a string S of length N. We say S has a period p if there exists a string W of length p such that S = W^x for some x>=2. Thus, every string S with period p has the following property: S[i]=S[i-p] for i>=p. It is easy to observe that any non-periodic string won’t have any period.
Now we have the following lemmas:
Lemma 1:
If the string S of length N has period p, then p must divisible by N.
Proof:
From the definition of period, we assume S=W^x for string W with length p and for x>=2. Comparing lengths on both sides, we get length(S)=length(W^x) . Therefore N = x*length(W)=x*p . Since x is an integer, it follows that p must be divisible by N.
Lemma 2:
If the string S of length N has periods p and q, then it also has a period of gcd(p,q).
Proof:
Without loss of generality, let us assume that p>q. If we were able to prove S has a period of p-q , then we can apply the same steps recursively and it follows directly from the Euclidean Algorithm that S has a period of gcd(p,q).
Since S has periods p and q, it follows that S[i]=S[i+p]=S[i-p] and
S[i]=S[i-q]=S[i+q].
(Ofcourse we need to consider valid indices for i,i+p,i-p, i+q, i-q ).
We need to prove S[i]=S[i-(p-q)] for all valid indices of i and i-(p-q).
If i>=p, then S[i]=S[i-p]=S[i-p+q]=S[i-(p-q)]
else if i<p \ and \ i>=p-q , then S[i]=S[i+q]=S[i+q-p]=S[i-(p-q)]
This implies that S[i]=S[i-(p-q)] for all i>=p-q.
Also, (p-q) is divisible by N, (since p and q are divisible by N).
Therefore, S has a period of p-q and it follows from the Euclidean algorithm that S also has a period of gcd(p,q).
Lemma 3:
Every periodic string S can be represented by some non-periodic string and this representation is unique.
Proof:
There always exists a non-periodic string W
which satisfies the condition S=W^x for x>=2. (Think why this is true. Take the smallest period of S and find the string corresponding to this period. It must be non-periodic.)
Now, we need to prove S can be represented by a unique non-periodic string W with length p. Let us assume this is not the case and S can also be represented by another non-periodic string T with length q. Without loss of generality assume p<q. Now S has two periods p and q. By lemma 2, S also has a period of gcd(p,q). Since T is divisible by gcd(p,q) and q>gcd(p,q), T has a period of gcd(p,q) and is a periodic string. But we assumed initially that T is a non-periodic string. Thus, we arrived at a contradiction and it implies that our assumption is wrong.
Having proved the lemmas, let’s proceed to the main problem.
-
Firstly, we need to note that the total number of strings of length i is 2^i
(the character at each position of the string can be 0 or 1)
. -
Let us try to find the number of periodic strings of length i. By lemma 3 and lemma 1, we know that every periodic string S of length i can be represented uniquely by some non-periodic string T of length j divisible by i i.e, S= T^x where x=\frac{i}{j}.
-
If we get the number of periodic strings of length i as num_p, then we get the number of non-periodic strings num_{np} of length i as num_{np} = 2^i - num_p.
-
To find num_p we need to add for every j <i divisible by i the number of non-periodic strings of length j since they help in forming a unique periodic string of length i (lemma 3).
-
This suggests a dynamic programming(dp) solution. Let us consider a dp of N states where
dp[i]
denotes the number of non_periodic strings of lengthi
. Here dp[i] = 2^i - num_p where num_p denotes the number of periodic strings of length i. -
Thus, we get dp[i] = 2^i - \sum_{j<i \ and \ j \in divisors(i) } dp[j].
-
We can precalculate 2^i(modulo \ M) for every 1<=i<=N in O(N) and store it in an array, say pow2.
pow2[0]=1; for(int i=0;i<n;i++) pow2[i]=(pow2[i-1]*2) % M;
-
For calculating dp[i], we need to iterate over all divisors of i(excluding \ i). This can be done in O(\sqrt[]{i}) as follows:
dp[i]=pow2[i]; for(int j=2;j*j<i+1;j++) { if(i%j==0) //Finding two factors j and i/j at the same time { dp[i]-=dp[j]; //For the factor j if(j!=1 && i/j!=j) // Taking care of (i/j not equal to i) and (i/j not equal to j) dp[i]-=dp[i/j]; // For the factor i/j } } dp[i] = (dp[i] % M + M) % M; //Taking care of modulo M
-
After calculating dp[i] for all i from 1 to N, we output dp[N] as our final answer.
BASE CASE:
-
dp[1]=2 ( since binary strings of length 1 can’t have any period and there are two possible
strings - "0" and "1").
TIME COMPLEXITY:
-
Precalculating the pow2 array takes O(N) time. Calculating the dp array takes \sum_{i = 1}^{N}\normalsize\sqrt{i} which has an upper bound of \sum_{i = 1}^{N}\normalsize\sqrt{N}=N\normalsize\sqrt{N}.
-
Therefore, the time complexity of the above algorithm is O (N\normalsize\sqrt{N}).
SPACE COMPLEXITY:
- Both pow2 array and dp array have N states. Therefore the space complexity of the above algorithm is O(N).
ALTERNATE EXPLANATION:
We can bring down the time complexity to O(N\log{N}) by precalculating proper divisors of every number using sieve of eratosthenes. The idea is to iterate over i from 1 to N and for each multiple of i(excluding \ i) <=N add i to their list of divisors.
Calculating the divisors takes time N*\sum_{i = 1}^{N}\frac{1}{i} which is O(N\log{N}). Similarly, storing the divisors takes O(N\log{N}) space.
SOLUTION:
Editorialist’s solution can be found here.
Please comment below if you have any questions, alternate solutions, or suggestions. This is my first editorial, so let me know if you like it.