DISCHAR - Editorial

PROBLEM LINK:

Author: Lalit Kundu

Tester: Sergey Kulik

Editorialist: Florin Chirica

DIFFICULTY:

cakewalk

PREREQUISITES:

greedy

PROBLEM:

You’re given a string. Let’s consider a set of all possible strings obtained by deleting some letters from my initial string. From this set, we pick those for which every letter appears exactly once. What’s maximum length possible of picked string?

QUICK EXPLANATION:

The resulting string is the one for which each character from the input appears exactly once in the output.

EXPLANATION

We’ll make 2 observations and based on them, we’ll find the solution. Let’s denote by x the initial string and by res the string x after removing some characters (possibly none), such as no two characters are equal and the length of res is maximal.

Observation 1

If a character appears in x, then it must appear in res. Let’s proof it: suppose there’s a character ch such as it appears at least 1 time in x, but it appears zero times in res. Then, let’s add it to res. Since it didn’t appear before in res and res already contained each two characters different, after adding ch to res, res will still contain each two characters different. But length of res is increased by one. This is a contradiction of the fact that res was the answer (there exist a string res’ for which each two characters are different but length of res’ is greater than length of res). The contradiction was obtained from assuming observation 1 is false. Hence, observation was is true.

Observation 2

If a character appears two or more times in x, then it can appear at most once in res. This is a direct consequence of the property that res has each two characters different.

Combining the observations

Let’s iterate each character from x. We’re build incrementally string res, starting from res = NULL. Suppose we’re processing now the character ch. We have two cases.

  1. Character ch didn’t appear before in x.
  2. Character ch appeared before in x.

Let’s suppose we’re in case #1. Using observation 1, we’re forced to add the character to res. Now, let’s suppose we’re in case #2. Since character ch have already appeared in x, it has been already added in res (when it appeared first time in x). Using observation 2, we are not allowed to add the same character in res. So, we’re forced to skip this character ch (or equivalently, to delete it from x).

There’s a single detail remained unsolved. Suppose we processed the first i characters of x. We need to know whether a character ch (corresponding to position i + 1) already appeared on the first i characters. We can keep an array used[ch] = 1 if character ch already appeared or 0 otherwise. We can use this to know if character ch appeared at first i characters. Then, we update used[ch] = 1 (we know for sure that character ch appears in the first i + 1 characters, as it appears in position i + 1) and move on.

Complexity

Since the only thing we do is to iterate over elements of strings, the complexity is O(n).

AUTHOR’S AND TESTER’S SOLUTIONS:

To be updated soon
To be updated soon

1 Like

Also, I found a nice albeit silly optimization. It was possible to preempt the search by maintaining a count of the unique characters you’ve already encountered for the current string. Since the string can only contain letters ‘a-z’, maximum length of the result “res” can be only 26. Hence, as soon as count becomes 26, i.e., length of res becomes 26, you could just exit the loop.

3 Likes

https://www.codechef.com/viewsolution/12279576
my solution is getting WA unable to understand where I went wrong any help please!!!

#include
using namespace std;
int main()
{
long int test;
cin>>test;
while(test–)
{long int a[26]={0},count=0,maxlength=0, flag=0;
string s;
cin>>s;
for(long int i=0;s*!=0;i++)
{if(a[s*-‘a’]==0)
{a[s*-‘a’]++;
count++;
}
else if(a[s*-‘a’]==1)
{if(count>maxlength)
{flag=1;
maxlength=count;
for(int k=0;k<26;k++)
a[k]=0;
count=0;}
}
}
if(flag==0)
cout<<count<<"
“;
else
cout<<maxlength<<”
";

}
return 0;

}

IM GETTING PARTIALLY CORRECT ANSWER IT GIVES 80/100 IM UNABLE TO FIND WHICH TEST CASE I’M MISSING
I HAVE CREATED AN ARRAY FOR a TO z WHENVER AT ANY INDEX VALUE BECOMES 2 I START AGAIN ANS SET MY ARRAY TO ZERO

PLEASE HELP…

For some reason, i thought the problem was to find the largest sub-string in which consecutive characters are distinct whereas the problem is basically to find the number of distinct characters in the input string.