Z-Algorithm[Tutorial]

Z-Algorithm

Note: This is not a complete tutorial where the complete concept is explained, though a link is provided to a tutorial where it is explained. I have just explained the basic overview and used it in two problems.

I find Z-algorithm to be one of the most useful and simple algorithms ever. It is very useful in competitive programming and comparatively much faster to code than KMP or sometimes even Naive.

Z-Algorithm is just a simple calculation of function z[i] on string. Where z[i] is the maximum length such that

substring(0...z[i]-1) = substring(i...i+z[i]-1)

It can be coded in just 4 lines, really.

vector<int> z(a.size(),0);
for(int i = 1, l = 0, r = 0; i < (int)a.size(); ++i){
	if(i <= r) z[i] = min(r-i+1, z[i-l]);
	while(z[i] + i < (int)a.size() && a[z[i]] == a[z[i]+i])z[i]++;
	if(i+z[i]-1 > r)l = i, r = z[i]+i-1;
}

Well understand algorithm is important, so go and learn all you want to here: Z-Function

Let’s talk about some uses in competitive programming (Codes are provided at the end).

  1. Simple Use - String Matching.
    Sample Question: NAJPF
    The fun thing here is the meaning of z[i] here itself. If we find Z-values of String A = pattern + "$" + text, that can any z[i] after index pattern.size() having length of pattern.size() matches the pattern itself. What’s even more fun is that no index before (or on) index pattern.size() can be equal to pattern.size(), so question is simply indices with z[i] == pattern.size()
  2. A little more awesome use:
    Sample Question: Prefix-Suffix Palindrome
    Spend some time on the question. And you will observe the following:
    Take as many characters from start and end as you can while it makes a palindrome and then lets see what we can do with the remaining string.
    Example: String = abcdfdcecba
    Let’s take abc prefix and cba suffix and we are left with dfdce
    Now it is guaranteed if we want a palindrome, we can only take from either prefix or suffix. But what to take? Note that if you have a palindrome A and you delete some prefix and suffix of length l, then what’s left is also a palindrome. So we need the longest prefix palindrome or suffix palindrome, whichever is longer.
    How can Z-values help here?
    Notice that if z[i] is z-value of index i for any string a, then if z[i]+i == a.length(), then this means that suffix from i matches prefix? So this can help us find pallindrome? Yes.
    How?
    Consider this string A = ABACD, let’s find the longest prefix palindrome.
    Consider B = rev(A) = DCABA, do you notice that the longest palindrome, is actually longest suffix matching in B matching prefix of A.
    Is this magic? No, just math.
    Any suffix of B of length l in originality is just rev(a[0...l-1])
    So on matching it becomes a[0...l-1] == rev(a[0...l-1])
    Can same be done to find longest suffix pallindrome? Yes.
    So for longest palindromic prefix find the longest suffix matching prefix in string A = A+"$"+rev(A)
    So for longest palindromic suffix find the longest suffix matching prefix in the string A = rev(A)+"$"+A
    And then it is just simple string construction.
Code for problem A
#include<bits/stdc++.h>
using namespace std;

int main(){
    ios_base::sync_with_stdio(false);
    cin.tie(0); cout.tie(0);
    int t;
    cin >> t;
    while(t--){
        string text, pattern;
        cin >> text >> pattern;
        text = pattern + "$" + text;
        vector<int> z(text.size());
        vector<int> ans;
        for(int i = 1, l = 0, r =0 ;i < (int)text.size(); ++i){
            if(i <= r)z[i] = min(z[i-l],r-i+1);
            while(z[i]+i < (int)text.size() && text[z[i]] == text[z[i]+i])z[i]++;
            if(z[i]+i-1 > r)l = i, r = z[i]+i-1;
        }
        for(int i = 0; i < (int)text.size(); ++i){
            if(z[i] == (int)pattern.size())ans.emplace_back(i-(int)pattern.size());
        }
        if((int)ans.size() == 0)cout << "Not Found";
        else{
            cout << ans.size() << "\n";
            for(int &i: ans)cout << i << " ";
        }
        if(t > 0)cout << "\n\n";
    }
    return 0;
}
Code for problem B
#include<bits/stdc++.h>
using namespace std;

int longestPallindromicPrefix(string a){
	if(a.size() == 0)return 0;
	string res = a;
	std::reverse(res.begin(),res.end());
	string s = a+"#"+res;
	vector<int> z(s.size(),0);
	for(int i = 1, r = 0, l = 0; i < s.size(); ++i){
		if(i <= r)z[i]=min(r-i+1,z[i-l]);
		while(z[i]+i < s.size() && s[z[i]] == s[z[i]+i])z[i]++;
		if(z[i] + i -1 > r)l=i,r=z[i]+i-1;
	}
	int ans = 1;
	for(int i = 0; i < s.size(); ++i){
		int lrem = s.size()-i;
		if(z[i] == lrem)ans=max(ans,lrem);
	}
	return ans;
}

int solve(){
    string a = "", b = "" , s;
	cin >> s;
	if(s.length() == 1){
		cout << s << "\n";
		return 0;
	}
	int i = 0, j = s.size()-1;
	while(i < j){
		if(s[i] == s[j]){
			a+=s[i++];
			b+=s[j--];
		}else{
			break;
		}
	}
	int save = i;
	string rem = "";
	while(i <= j){
		rem+=s[i++];
	}
	int pref = longestPallindromicPrefix(rem);
	reverse(rem.begin(),rem.end());
	int suff = longestPallindromicPrefix(rem);
	i = save;
	if(suff >= pref){
		for(int k = 0; k < suff; ++k){
			b+=s[j-k];
		}
	}else{
		reverse(rem.begin(),rem.end());
		for(int k = 0; k < pref; ++k){
			a+=s[i+k];
		}
	}
	reverse(b.begin(),b.end());
	return cout << a << b << "\n",0;
}

int main(){
	int t;
	cin >> t;
	while(t--)solve();
	return 0;
}

Please do leave a like if this helps you in any way. And also do tell if you think I should improve something. Not much on the writing side.

11 Likes