TASTRMAT - Editorial

dpraveen · December 28, 2014, 2:03pm

PROBLEM LINK:

Practice
Contest

Author: Tuan Anh
Tester: Gedi Zheng
Editorialist: Praveen Dhinwa

DIFFICULTY:

Hard

PREREQUISITES:

Fast Fourier Transform, bitwise operations.

PROBLEM:

Given a binary string A of size N and B of size M. For each sub-string x of A having length |B|, you need to find hamming distance
between x and B.

Hamming distance of the two binary string of the same length is the number of the positions where the two strings have
different characters.

eg. the hamming distance of “01010101” and “00110111” is 3.
the hamming distance of a binary string with itself is 0.

QUICK EXPLANATION:

Primarily there are three possible solutions depending on the subtasks.

For passing the first subtask, you can directly use brute-force to calculate hamming distance between two strings.
For passing the second subtask, you can directly use brute-force with a small optimization of packing 64 bits into a single
integer which can be stored in unsigned long long data type in C++. Finding hamming distance between two strings x and y can be done
by bitwise operations which are normally executed using gates which takes almost constant CPU cycles.
You can make use of the fact that if you represent the string using polynomials, problem of finding hamming distance can be
converted to multiplying two polynomials which can be done faster using fast fourier transform.

EXPLANATION:

Let us start with subtasks in increasing order of difficulty.

Subtask 1:

As N and M are quite small, we can try all possible sub-strings x of A having length M and we can find hamming distance between
two string x and B. Finding hamming distance between two strings can be done in O(sum of sizes of strings).

For finding hamming distance between s and t, we can simply iterate over the string s and count the number of positions
where s and t have different characters.

Pseudo Code

calculateHammingDistance(s, t):
	ans = 0;
	for i = 0 to |s|:
		if s[i] != t[i]:
			ans++;
	return ans;

foreach sub-string x of A:
	calculateHammingDistance(x, B);

Subtask 2:

Strings are binary Strings !!!
Let’s make use of the fact that strings we are dealing with are binary strings. Can we somehow use bitwise operations to finding hamming
distance faster. Assume we have to find hamming distance between s and t. Let us pack s into s / 64 groups of integers. Each group
of integer represents 64 bits of s. Now we need to find out a way to evaluate number of bits where two integers differ.

Use bitwise operations
For that we can make use of xor operation. Let us understand xor operation, If the bits are same, then xor is zero, otherwise it is one.
So number of differing bits between x and y will be equal to number of set bits (bits having value equal to 1) of x ^ y
(here ^ represents bitwise xor operation).

Finding set bits in an integer
For finding set bits of an integer, we can assume __builtin_popcount in c++. Actually there are lot of ways of doing this. You
can have a look this wiki link and this blog.

Time Complexity
Se will initially divide the entire string A into A / 64 parts.
Then for each sub-string, we will have M / 64 operations for finding hamming distance. As there are N - M + 1 sub-strings.
So Overall time complexity will be O((N - M + 1) * M / 64) .

Subtask 3

Represent strings in terms of polynomials
Now we have to make slightly sophisticated use of polynomials. Let us represent our strings as polynomials.
Note that we will replace 0 with -1 in the string to get a polynomial of degree equal to size of string - 1.
eg.

String 010 will be a polynomial (-1) x^2 + 1 x + -1 = - x^2 + x - 1.
String 100 will be a polynomial (1) x^2 + -1 x + -1 = x^2 - x - 1
String 0 will be a polynomial -1.
String 1 will be a polynomial 1.

What does hamming distance represent in terms of polynomials?
Now let us see how to represent the hamming distance between two strings using coefficients of polynomials.

Take string s and t. Reverse the string t. Represent both the strings as polynomials as described above.
Now take the product of both the polynomials, you can observe the absolute value of coefficient of x^(|s| - 1) will be equal to number of equal bits -
number of different bits. As we know total number of bits, we can exactly find number of equal or different bits easily by
solving two linear equation in two variables.

Let’s take an example
Let s = 010, t = 100.
Initially we reverse the string t. So now t becomes 001.
String 010 will be a polynomial p = - x^2 + x - 1.
and String 011 will be a polynomial q = -1x^2 -1 x + 1.
Now see the coefficient of x^2 in product of both polynomials p and q.
it will be (-1 * 1 + 1 * -1 + -1 * -1) = -1. Absolute value of it will be 1.

Note that if both the bits are equal, then product of those is 1 (as 1 * 1 = 1 and -1 * -1 = 1).
If they are different, then the product will be -1. (As 1 * -1 = -1).
So overall the coefficient will be number of equal bits - number of different bits which in our case is 1.

As we know the number of different bits is precisely hamming distance.

We also know that number of equal bits + hamming distance = 3.
and number of equal bits - hamming distance = 1.

So solving these two equations, we get hamming distance = 2 and number of equal bits = 1.

Complete Solution
So we will take strings A and B. Reverse the string B. Represent both the strings as polynomials as described above.
Now instead of iterating over all the sub-strings of A. we will simply multiply the polynomials corresponding to String A and
reverse string of B.

For each sub-string, we can get the answer by looking at the corresponding coefficients. Please look into previous example
for better understanding.

eg.
If A = 0100 and B = 100.
Then we need to take care of coefficients of x^3 and x^2. Coefficient of x^3 can be used to find out hamming distance of
010(A[0:2]) and 100(B) and coefficient of x^2 can be used to find out hamming distance of 100(A[1:3]) and 100(B).

Multiplying Two polynomials
We can multiply two polynomial of deg n faster than n^2. We can use fast fourier transform(FFT) to do this in O(n log n) time.

For understanding FFT, you can use the following links.

Time Complexity:
Usually for FFT, your polynomials should usually be of degrees equal to some power of 2.
As FFT works in n log n where n is sum of degrees of polynomials which we are trying to multiply.
So Overall, it will O(N log N) where N is smallest power of two greater than or equal to size of string A + B.

Problems to Practice

Please add more problems for practice !!

AUTHOR’S, TESTER’S and Editorialist’s SOLUTIONS:

setter
tester

pushkarmishra · December 28, 2014, 2:14pm

Is fast fourier transform even there in the IOI syllabus?

I don’t think it is. What is the point of setting such a question in a competition meant for kids preparing for IOI?

xllend3 · December 28, 2014, 3:07pm

Because of the weak hash function, this problem can be passed using a simple O(n) solution. So it’s better to select a stronger hash function next time.

kuruma · December 28, 2014, 7:04pm

Hi,

I think almost all problems that appear on Codechef are pretty hard in general if one does not have the right preparation, or the right amount of training… That is true for me on almost every contest… (my all time best was solving 4 full problems+1 challenge problem)

I also pointed this out to @Suraj, for the short time where I was an editorialist for the lunchtime contests (I hope to be back as a setter around february/March for those who are wondering where I am =)) ) and I also “complained” about how hard these contests are becoming in general (you never wondered why only teams or Google Software Engineers/Algorithms researchers seem to win them? ) and even for me, it is becoming quite exhausting/demotivating to “insist” on trying to perform well on contests for which I’m simply not made for ^^

Either they plan to form highschool prodigies, or I believe that these kids will get somewhat mindblown or demotivated by how hard these contests are…It’s good to push oneself forward (for me, long contests give me that with DP+Trees problems which I almost never end up solving) but maybe pushing too forward might just make you fall of a cliff

I don’t mean this for myself nor to scare the pros away from here, but, maybe some adjustment should be considered to all contests to have more…appraochable problems, o at least more problems where one can actually learn by reading some university curriculum or something… The teachers I’ve spoken with at my university would be crushed here

Best,
Bruno

kuruma · December 28, 2014, 10:22pm

@dpraveen,

Like I said above, the problem is me and not the statements…Or the pre-requisites for each problem!!

I believe nobody is born a genius and that most people can be trained and I’m sure that my reality as someone from a country as Portugal where almost no incentive is given to computer programming competitions, is faaaaar different from your realities, possibly being in constant exposure to IOI problems, training for gold in these competitions, writing lots of problems and mastering basic algorithms from an early stage, etc, etc.

I only coded a DFS for the first time 1 year ago (and I’m 24 years old now, how’s that for a late start?). Given this, I am not making that bad of a progress imho, I even managed to solve some cool mathematical problems I’m proud of and definitely learnt a lot since I first came here…

The pre-requisites which you claim are needed to be known by any high school student (DFS Order, LCA, graphs, heavy-light decomposition, FFT, etc) are only thaught at the later years of a Bachelor’s degree here in Portugal or only in Master’s degree (meaning only someone really choosing Algorithms as a subject to master will be exposed to these concepts, so, as you see, this reality of having high school kids crushing everyone really “confuses me” and makes me feel dumb/useless).

I think that some scaling down should be done, according to my own reality, at least.

Instead of having only really, really skilled coders setting problems, I think a platform like Codechef would gain a lot if it had the direct intervention of a group of teachers from several universities (India or outside too) which would scale down the problems to be more suitable for “eager kids wanting to learn more” (or maybe I just don’t want to face the fact that 15 year old kids solved harder problems than I’ll ever solve in my lifetime as a competitor/enthusiast) or at least would follow any university curriculum more closely, I don’t really know or maybe my opinion is biased due to the fact that I’m a “loser”

What do you think? I mean, how did YOU trained for IOI yourself before there was a Codechef? When did you wrote your first DFS? In the end, it’s all up to how trained or smart someone is, but, maybe I would change some things, although I’m totally biased to be discussing this…

It’s just my two cents

dddlll · December 28, 2014, 11:53pm

hi guys,

I solved this problem for all the sub tasks only using pre-processing (but it was after the contest). please see the solution link. I just added the contribution due to ith bit of first string to the final answer in the main loop. Complexity O(n log n).

dpraveen · December 29, 2014, 3:07pm

@kuruma:

I agree with you completely that difficulty level of some tasks can be degraded.

I also think that lunchtime has small subtasks which are really cakewalk most of the times which makes it a good learning point for beginners. Though it is slightly disappointing for one to not get even one problem accepted. But one will learn a lot by this format and slowly this problem of not getting even one accepted will go away.

Scaling down difficulty level is an important idea and this is why I think codechef introduced subtasks in even long contests.

I would say that though these advance ideas/ algorithms seem complicated, they are actually not that complicated and can be understood by High school students specially people who are preparing for IOI. Many of the big coders learn them mostly in high school.

Personal Experience
Regarding advance algorithms, I feel same as you. I had learned them at my university. I learned C language in my first year at college. I got started with serious algorithmic programming at the end of 2nd year. The advanced topics some of them I have learned in 4th year or so. Still there are a lot of competitive programming algorithm which I don’t know. Also I did not train for IOI as I did not know about it all. I wrote my first dfs program in the summers of 2nd year.

Thank you for sharing your experience

gdisastery1 · December 28, 2014, 2:19pm

What’s the difference between this problem and Problem - G - Codeforces this?

neo1tech9_7 · December 28, 2014, 2:20pm

even lucas theorem or most of the modular arithmetic isn’t in the IOI syllabus which was required to solve the second question.

dpraveen · December 28, 2014, 2:23pm

@gdisastery1: Solution for that problem is much more complex by using data structures etc. It is not simple FFT as it is the case here.

dpraveen · December 28, 2014, 2:30pm

@pushkarmishra: Sometimes setters are not from IOI background, I also did not knew IOI syllabus I have notified admins, they have said that it won’t happen again.

pushkarmishra · December 28, 2014, 2:35pm

@dpraveen I have raised this point several times before. I even talked to Anup Kalbalia about this in person. I hope this won’t happen again.

neo1tech9_7 · December 28, 2014, 7:10pm

I think that problem difficulty should be increased but there should be easy subtasks that everyone could solve. This way no one will be demotivated and people who want to do hard problems will be satisfied too. Also IOI has much tougher problems so lowering the level of problems is not good.

kuruma · December 28, 2014, 7:20pm

IOI problems are harder? Really? Harder than using FFT? Clearly I’m still just a newbie when it comes to programming competitions… should have started earlier

neo1tech9_7 · December 28, 2014, 7:25pm

Well if someone doesn’t know FFT then obviously not. In IOI you have 5 hours to solve 3 problems and 1-3 people get perfect score out of hundreds. Now consider today’s lunchtime 2 problems required concepts out of the course of IOI to get full marks. Still 8 people got perfect score so I think IOI problems are definitely harder than lunchtime’s

neo1tech9_7 · December 28, 2014, 7:26pm

8 people in school*

kuruma · December 28, 2014, 7:35pm

Well yes, I understand that and I’m not saying otherwise, I’m just commenting that for me, as someone who is not so used to competitive programming and never ever attended IOI, SWERC, whatever, these problems are hard and untangling statements about trees/DP problems when I’ve only been exposed to the basics at university, it’s super hard… Add to that lack of previous experience and time to train and all of a sudden solving 6 problems or 5 problems consistently in the long challenge becomes a real challenge for me

dpraveen · December 28, 2014, 8:30pm

@kuruma: IMHO problems were not that hard, but you needed to know few concepts for solving that (prerequisite for solving the problems was more than expected from high school student)). But I think IOI problems are more harder than codechef lunchtime problems. Codechef has introduced subtasks to make sure every person learns something. So it should be interesting enough for everybody. What change you think can be done?

neo1tech9_7 · December 28, 2014, 9:26pm

@kuruma according to me challenge is the most fun part

pushkarmishra · December 28, 2014, 10:25pm

@dpraveen is right. The questions weren’t tough. Just required prerequisite knowledge of things which are not a part of the IOI syllabus. My IOI experience tells me that IOI problems are certainly longer than the codechef ones. Whether they are tougher, thats a subjective issue. My only request is that please remain within the scope of IOI as far as lunchtimes are concerned.