ARTBALAN - EDITORIAL

PROBLEM LINK:

Practice
Contest: Division 1
Contest: Division 2

Setter: Pranjal Rai
Tester: Alexey Zayakin and Yash Chandnani
Editorialist: Taranpreet Singh

DIFFICULTY:

Easy

PREREQUISITES:

Greedy and Basic Math would do.

PROBLEM:

Given a string S consisting of uppercase characters only, you are to find out the minimum number of characters to change so as to make the string balanced. A string is balanced if all distinct characters present in string occur the exact same number of times.

QUICK EXPLANATION

  • In the final string, C*F = |S| where C is the number of distinct characters while F is the frequency of each character. Also, constraint 1 \leq C \leq 26 holds.
  • So, We check all 26 values of C and count the maximum number of characters from the original string we can keep. It can be easily seen, that maximum characters that can be kept are \sum_1^C min(x[j], F) where x[j] denote jth greatest frequency among the set of frequencies of characters present in the initial string.

EXPLANATION

Suppose in final string, each character occurs F times and there are C distinct characters present in the string. It can be seen that the length of such string is C*F. But we never change the length of the string. So, We need to consider only those values of C and F such that C*F = |S|. Also, we cannot have more than 26 distinct characters, so constraint 1 \leq C \leq 26 hold.

So, letā€™s check each value of C. If |S| \bmod C \neq 0 the string cannot be balanced using C distinct characters, so ignore such values of C. Otherwise, the balanced string shall contain |S|/C occurrences of each of distinct C characters. The question here is, who to identify which character to change?

To understand this, instead of changing the minimum number of characters, let us preserve the maximum number of characters. We can see, that we can choose any of the distinct C characters present, and preserve at most F = |S|/C occurrences of each chosen character. It makes sense to choose the characters with maximal frequencies and try to preserve maximum characters. If we sort the character frequencies in decreasing order, it can be seen that its optimal to choose first min(C, D) (D denote the number of distinct characters in initial string) characters and for each of these characters, preserve min(f, |S|/C) characters. The letters not preserved need to be replaced.

Hence, we can check for each value of C which gives the minimum number of characters to be changed and print this minimum number of characters to be changed.

Time Complexity

The time complexity is O(|S|+A*log(A)) per test case where A = 26 is the size of the alphabet.

AUTHORā€™S AND TESTERā€™S SOLUTIONS:

Setterā€™s solution
Testerā€™s solution
Editorialistā€™s solution

Feel free to Share your approach, If it differs. Suggestions are always welcomed. :slight_smile:

2 Likes

The Setterā€™s Solution, Testerā€™s Solution, and Editorialistā€™s Solution arenā€™t visible. :frowning:

Nice and very clean code for a complex looking problem.

If |S| mod C eq 0 the string cannot be balanced using C distinct characters, so ignore such values of CC.

Shouldnā€™t it be not equal to?

@taran_1407

To understand this, instead of changing the minimum number of characters, let us preserve the maximum number of characters. We can see, that we can choose any of the distinct CC characters present, and preserve at most F = |S|/CF=āˆ£Sāˆ£/C occurrences of each chosen character. It makes sense to choose the characters with maximal frequencies and try to preserve maximum characters. If we sort the character frequencies in decreasing order, it can be seen that its optimal to choose first min(C, D)min(C,D) (DD denote the number of distinct characters in initial string) characters and for each of these characters, preserve min(f, |S|/C)min(f,āˆ£Sāˆ£/C) characters. The letters not preserved need to be replaced.

canā€™t understand this.