MINMXSTR - Editorial

PROBLEM LINK:

Practice
Dementia 2020 (Div.2) Contest

Author: Swapnil Rustagi
Testers: Vishal Anand , Smit Mandavia
Editorialist: Swapnil Rustagi

DIFFICULTY:

MEDIUM

PREREQUISITES:

Trie, Map/Set data structures in C++ (or their equivalent in other languages), and LCA using Binary Lifting (although there is an approach without this).

PROBLEM:

Initially given a collection N distinct strings such that their total length does not exceed 2*10^5, handle at most 2*10^5 queries of 3 types. The strings are assinged indices from 1 to N in the order of input.

Type 1- having an index idx (idx ranges between 1 and the number of strings in the collection) and a character c. Create a new string formed by concatenating the existing string in the collection with index idx with character c, and insert in into the collection. if p strings existed in the collection, then this new string is assigned index p+1.

Type 2 and 3 - Print the indices of the lexicographically minimal and maximal strings in the collection, which are not prefixes of any other string in the collection.

QUICK EXPLANATION

Let us assume that the current lexicographically minimal string is s=s_1s_2...s_ps_{p+1}...s_n. Then a newly added string can be lexicographically minimal only if either, the query of type 1 has current lexicographically minimal string as the index (here, the current lexicographically minimal string would became a proper prefix which is not allowed, and so the answer changes to the index of the string being inserted in this query) or, this newly added string is formed by a prefix of s, so the newly added string must be of the form s'=s_1s_2...s_pc where c is the character in query of type 1, and c < s_{p+1}. The conditions are very similar for lexicographically maximal string, except c>s_{p+1}.

The strings need to be stored in a Trie, storing index and depth in each node, and jump pointers for binary lifting. Now, checking the prefix condition simply requires checking ancestry using LCA (Binary Lifting).

The required time complexity for any solution is at least O((N+Q)(log (N+Q))).

EXPLANATION:

Build a trie of the initial collection - the nodes should have fields for index and depth, in addition to the children array. Also have an array of pointers, such that arr[i] points to the trie node containing the last character of string with index i.
First, take traverse the trie, starting from the root, going to the left-most non-null node each time until you encounter a leaf node, the index stored here is the answer to query of type 2. Similarly for queries of type 3, except go to right-most non-null node.

Store these answers for queries of type 2 and 3, and just output them when given queries are given. Now we need to handle queries of type 1, and if adding a string changes the answers to queries of type 2 or 3, update the answers.

Let’s discuss the case of lexicographically minimal string.

Let’s say string s is the current lexicographically minimal string, and the corresponding index is lexmin. There are two cases where the answer to query of type 2 could change -

  1. A query of type 1 having index=lexmin. This would cause s to become a proper prefix of this new string being added. This is simple to handle, just update the answer to query 2 to be the index of this string being added.

  2. A query of type 1 having index idx such that the node pointed by arr[idx] is an ancestor of arr[lexmin] - in other words, there is a string whch is a proper prefix of the current lexicographically minimal string s. Checking ancestry is done using Binary Lifting and node depth.

Consider s=s_1s_2...s_ps_{p+1}...s_n
And the newly added string to be s'=s_1s_2...s_pc, where c is the character in the query.

if character c < s_p, then our answer changes.

So after checking ancestry, we check which is the leftmost non-NULL child of the node pointed to by arr[idx]. If the character we are adding in query 1 is smaller, then answer changes.

For checking ancesry, we need Binary Lifting using jump pointers - these pointers need to be computed for all nodes. You may refer to the functions build_jump_pointers and isAncestor in the author’s solution for more details regarding implementation.

ALTERNATE METHOD WITHOUT BINARY LIFTING

This is tester Vishal Anand’s solution (the code is given below), and is easier to implement.

Store all the queries and process them offline in the following manner. Let K be the total number of strings after all queries are completed (simply N+queries of type 1).

The approach is to make a trie with nodes having fields {character, index, rank from 1 to K}). This rank is only for nodes which correspond to the last character of some string. Also make a map or set (std::map/std::set in C++, or equivalent in other languages), to store {rank, index}, i.e. the map/set is ordered by rank. At this point this map/set is empty.

How to compute rank? After having added all nodes according to the queries, do an in-order traversal of the trie in linear time, storing the appropriate ranks in nodes. While doing this in-order traversal, when encountering a node having index <= N, insert {rank,index} of this node into the map/set.

So now, this map contains the initial strings ranked in correct lexicographical order (although the ranks are between 1 to K, instead of N, the relative ranks are correct).

Now we go over the queries one by one, the answers to queries of type 2 and 3 are first and last elements of this map/set, both accessible in O(1). For queries of type 1, let’s say the index in the query is idx. Access the corresponding node in the trie, which also stores it’s rank. Now find the element with this rank in the map/set and delete it. Also get the node in the trie corresponding to last character of string inserted in this query. Get it’s {rank, index} and insert into the map/set.

SOLUTIONS:

Setter's Solution (O(log n) per query with Binary Lifting)
#include <bits/stdc++.h>
using namespace std;
#define ll long long
#define loop(i,a,b) for (int i=a; i<b; ++i)
 
typedef struct TrieNode {
    int index;
    TrieNode * parent;
    int depth;
    vector < TrieNode *> children;
    TrieNode()
    {
        depth=0;
        index=-1;
        parent=NULL;
        children.assign(26, NULL);
    }
} TrieNode;
 
string  getString(TrieNode * node)
{
    TrieNode * parentnode=node->parent;
    string ans="";
    while (parentnode!=NULL)
    {
        for (int i=0;i<26; ++i)
        {
            if (parentnode->children[i]==node) 
            {
                ans+=((char)('a'+i));
                break;
            }
        }
        node=parentnode;
        parentnode=parentnode->parent;
    }
    reverse(ans.begin(), ans.end());
    return ans;
}
 
void build_jump_pointers(TrieNode * node, unordered_map <TrieNode *, vector <TrieNode *> > &jump_pointers)
{
    int pow2=1, size=1;
    for (int i=1; i<=21; ++i)
    {
        pow2*=2;
        if (node->depth < pow2)
        {
            break;
        }
        else size=i+1;
    }
    jump_pointers[node].resize(size);
    jump_pointers[node][0]=node->parent;
    for (int i=1; i<size; ++i)
    {
        jump_pointers[node][i] = jump_pointers[jump_pointers[node][i-1]][i-1];
    }
}
 
TrieNode * addstring(TrieNode * root, const string &s, int index, unordered_map < TrieNode *, vector <TrieNode *> > &jump_pointers)
{
    TrieNode * node=root;
    for (auto c: s)
    {
        if (node->children[c-'a']==NULL) {
            node->children[c-'a']=new TrieNode;
            node->children[c-'a']->parent=node;
            node->children[c-'a']->depth=node->depth+1;
            build_jump_pointers(node->children[c-'a'], jump_pointers);
        }
        node=node->children[c-'a'];
    }
    assert(node->index==-1);
    node->index=index;
    return node;
}
 
int minString(TrieNode * root)
{
    TrieNode * node=root;
    while (1)
    {
        bool leaf=true;
        for (int i=0; i<26; ++i)
        {
            if (node->children[i]!=NULL)
            {
                leaf=false;
                node=node->children[i];
                break;
            }
        }
        if (leaf) break;
    }
    return node->index;
}
 
int maxString(TrieNode * root)
{
    TrieNode * node=root;
    while (1)
    {
        bool leaf=true;
        for (int i=25; i>=0; --i)
        {
            if (node->children[i]!=NULL)
            {
                leaf=false;
                node=node->children[i];
                break;
            }
        }
        if (leaf) break;
    }
    return node->index;
}
 
bool isAncestor(TrieNode * u, TrieNode *v, unordered_map <TrieNode *, vector <TrieNode *> > &jump_pointers)
{
    if (v->depth < u->depth) return false;
    if (v->depth == u->depth) {
        if (v==u) return true;
        return false;
    }
    int diff = v->depth - u->depth;
    TrieNode * node = v;
    for (int i=0; i<=20; ++i)
    {
        if (diff & (1<<i)) {
            node=jump_pointers[node][i];
        }
    }
    if (node==u) return true;
    return false;
}
 
int main()
{
    ios_base::sync_with_stdio(false);
    cin.tie(NULL);
    int T; cin >> T;
    while (T--)
    {
        TrieNode * root = new TrieNode;
        ll N;
        cin >> N;
        vector < TrieNode *> leaves(3e5, NULL);
        unordered_map < TrieNode *, vector <TrieNode *> > jump_pointers;
        int nextindex=1;
        loop(i,0,N)
        {
            string s;
            cin >> s;
            leaves[nextindex]=addstring(root,s, nextindex, jump_pointers);
            ++nextindex;
        }
        //loop(i,0,N) cout << getString(leaves[i+1]) << endl;
        int minindex=minString(root);
        int maxindex=maxString(root);
        ll Q;
        cin >> Q;
        for (int z=1; z<=Q; ++z)
        {
            int type; cin >> type;
            if (type==1) {
                int index; char c;
                cin >> index >> c;
                assert(leaves[index]->index!=-1);
                assert(leaves[index]->children[c-'a']==NULL || leaves[index]->children[c-'a']->index==-1);           
                if (leaves[index]->children[c-'a']==NULL) 
                {
                    if (isAncestor(leaves[index], leaves[minindex], jump_pointers))
                    {
                        if (index==minindex) {
                            minindex=nextindex;
                        }
                        else {
                            int smallest=-1;
                            for (int i=0; i<26; ++i) {
                                if (leaves[index]->children[i]!=NULL)
                                {
                                    smallest=i;
                                    break;
                                }
                            }
                            assert(smallest!=-1);
                            if ((c-'a') < smallest) {
                                minindex=nextindex;
                            }
                        }
                    }
                    if (isAncestor(leaves[index], leaves[maxindex], jump_pointers))
                    {
                        if (index==maxindex) {
                            maxindex=nextindex;
                        }
                        else {
                            int largest=-1;
                            for (int i=25; i>=0; --i) {
                                if (leaves[index]->children[i]!=NULL)
                                {
                                    largest=i;
                                    break;
                                }
                            }
                            assert(largest!=-1);
                            if ((c-'a') > largest) {
                                maxindex=nextindex;
                            }
                        }
                    }
                    leaves[index]->children[c-'a']=new TrieNode;
                    leaves[index]->children[c-'a']->depth = leaves[index]->depth+1;
                    leaves[index]->children[c-'a']->parent=leaves[index];
                }
                leaves[index]->children[c-'a']->index=nextindex;
                leaves[nextindex]=leaves[index]->children[c-'a'];
                build_jump_pointers(leaves[nextindex], jump_pointers);
                //cout << getString(leaves[nextindex]) << endl;
                ++nextindex;
            }
            else if (type==2) cout << minindex << "\n";
            else if (type==3) cout << maxindex << "\n"; 
        }
    }
    return 0;
}    
Tester (Vishal Anand)'s Solution (without Binary Lifting)
#include <bits/stdc++.h>
using namespace std;
 
const int ALPHABET_SZ = 26, MX1 = 2e5 + 7;
int T, N, Q, type, K, curr_rank;
string s;
map<int, int> rank_index;
        
struct queries {
    int type, index;
    char ch;
};
 
queries Qarr[MX1];
 
struct TrieNode { 
    struct TrieNode *next[ALPHABET_SZ];
    int rank, index;
    char ch;
    bool isEnd, invalid; 
};
 
TrieNode *PtrArr[MX1];
 
TrieNode *head = NULL;
 
TrieNode *newNode() {
    TrieNode *node =  new TrieNode; 
    node->isEnd = node->invalid = 0;
    node->rank = node->index = -1;
    for (int i = 0; i < ALPHABET_SZ; i++) 
        node->next[i] = 0; 
    return node; 
} 
 
void insert() {
    TrieNode *node = head;
    for(int i = 0; i < s.length(); i++) {
        if(!node->next[s[i] - 'a']) node->next[s[i] - 'a'] = newNode();
        node = node->next[s[i] - 'a'];
    }
    node->isEnd = 1;
    node->index = K;
    PtrArr[K] = node;
}
 
void insertFromNode(TrieNode *node, char ch) {
    if(!node->next[ch - 'a'])   node->next[ch - 'a'] = newNode();
    node = node->next[ch - 'a'];
    node->isEnd = 1;
    node->index = K;
    PtrArr[K] = node;
}
 
void updateTrieAndMap(TrieNode *node, char ch) {
    if(rank_index.find(node->rank) != rank_index.end())
        rank_index.erase(node->rank);
    // node->invalid = 1;
    node = node->next[ch - 'a'];
    if(!node->invalid)
        rank_index[node->rank] = node->index;
}
 
bool preProcess(TrieNode *node, int immediate_parent_index) {
    bool prv_invalid = 0;
    if(node->isEnd && (node->index < N)) {
        if(immediate_parent_index != -1) {
            PtrArr[immediate_parent_index]->invalid = 1;
            prv_invalid = 1;
        }
        immediate_parent_index = node->index;
    }
    if (node->isEnd) node->rank = curr_rank++;
    for(int i = 0; i < ALPHABET_SZ; i++) {
        if(node->next[i]) {
            node->invalid |= preProcess(node->next[i], immediate_parent_index);
        }
    }
    return prv_invalid | node->invalid;
}
 
void fillInitialMap(TrieNode *node) {
    if(node->isEnd && node->index < N && !node->invalid)
        rank_index[node->rank] = node->index;
    for(int i = 0; i < ALPHABET_SZ; i++) {
        if(node->next[i])
            fillInitialMap(node->next[i]);
    }
}
 
void printTrie(TrieNode *node, string ss) {
    if(node->isEnd) {
        cout << ss << " {" << node->rank << ", " << node->index << "},";
    }
    for(int i = 0; i < ALPHABET_SZ; i++) {
        if(node->next[i]) {
            string s2 = ss;
            s2.push_back('a' + i);
            printTrie(node->next[i], s2);
        }
    }
}
 
int main() {
    
    cin >> T;
    while(T--) {
        rank_index.clear();
        for(int i = 0; i < MX1; i++)    PtrArr[i] = NULL;
        head = newNode();
        cin >> N;
        for(int i = 0; i < N; i++) {
            K = i;
            cin >> s;
            insert();
        }
        curr_rank = 1;
        cin >> Q;
        for(int i = 0; i < Q; i++) {
            cin >> Qarr[i].type;
            if(Qarr[i].type == 1) {
                K++;
                cin >> Qarr[i].index >> Qarr[i].ch;
                Qarr[i].index--;
                insertFromNode(PtrArr[Qarr[i].index], Qarr[i].ch);
            }
        }
        preProcess(head, -1);   
        fillInitialMap(head);
        // printTrie(head, "");
        for(int i = 0; i < Q; i++) {
            if(Qarr[i].type == 1) {
                updateTrieAndMap(PtrArr[Qarr[i].index], Qarr[i].ch);
            } else if(Qarr[i].type == 2) {
                cout << rank_index.begin()->second + 1 << endl;
            } else {
                cout << rank_index.rbegin()->second + 1 << endl;
            }
        }
    }
    
	return 0;
}
2 Likes

Can this question be done without use of trie/binary lifting or advanced data structures??

Can’t be done without a trie, but yes there is a simpler method without least common ancestor - which is the tester’s solution which has been shared. This editorial is not finished yet, I will describe the solution without binary lifting/LCA as well.

1 Like

Very interesting problem

The editorial is now complete.

@m_scofield Please look at the alternate method - it doesn’t require any advanced data structures beyond trie (which is actually not so advanced).

@humane thank you.

1 Like

Can anyone please figure out why the code given fails. It was giving SIGCONT error.
I had tried the recursive method as discussed in the tester’s code.
Thanks. :slightly_smiling_face:
Here’s my code.

Your code is just brute-force (which won’t pass even if you resolve the runtime error) and it’s in no way related to the tester’s code. Read the editorial again.

Thanks swapnil. Sorry for that, I’ll check the editorial once again.