question : Problem - D - Codeforces
I have read the editorial but didn’t understand it! I am weak in maths so, can someone please explain it more!
question : Problem - D - Codeforces
I have read the editorial but didn’t understand it! I am weak in maths so, can someone please explain it more!
Suppose we look at the first test case provided, which has k=1, p_a=1 and p_b=1.
Let E(aaa) be the Expected number of subsequences of the form ab if we start with string aaa and extend the string as described in the question. Then
But E(aaab)=3 because aaab contains three subsequences of the form ab, so we have
Repeat this logic a few times and we find
More algebra to evaluate this series will show that E(aaa) = 4.
Similarly we could show E(aa)=3 and E(a)=2.
Finally, we have to evaluate E(), the expected number of subsequences ab given that we start with an empty string. Following the rules, we have
But E(b) = E() because an initial b doesn’t add any extra ab subsequences. And we already have E(a)=2. So we can solve to find E() = 2.
More generally, if k \gt 2, then it is almost obvious that E(abaab) = E(aabba) because after additional characters are appended to the two strings, the number of subsequences of the form ab will be identical. The strings abaab and aabba both contain three a and four ab subsequences Thus, in the notation of the codeforces tutorial,
.
Hope that helps a bit.
Write E(i,j) for the expected number of subsequences ab given that we start with a sequence containing i letter a subsequences, and j subsequences ab. My E(i,j) is the same as dp[i][j] from the codeforces tutorial, but with slightly less typing.
The main relation linking the E(i,j) values is (from codeforces tutorial)
The two parts on the right hand side correspond to adding letter a and letter b respectively.
Swapping the order, and expanding E(i+1,j) gives
And if we now expand E(i+2,j), and then E(i+3,j), and so on we produce an infinite series
If i+j \ge k, then E(i+r,i+j+r) = i+j+r (this is in the codeforces tutorial, and follows directly from the codeforces question). So we have
We know (or can look up) standard formulae that give
and
Substituting, and after a little manipulation to simplify, we find the expression
Let’s try to apply this result to solve the second sample case from codeforces, with k=3, p_a=1, and p_b=4. The first four values come from this expression with i+j \ge k:
Then we apply the main dynamic programming expression:
The last step requires a little manipulation:
This answer is an attempt to provide some explanation behind the main recurrence relation
If S is a sequence containing i subsequences of the form a and j subsequences of the form ab, then if you append an a you have i+1 subsequences of the form a, and if you append a b then you have i+j subsequences of the form ab.
But I think the harder piece is to understand what is meant by “Expected number of subsequences” of the form ab.
The codeforces question asks us to find the expected number of times
ab is a subsequence in the resulting sequence.
Or expressed another way, we are asked to:
There are an infinite number of resulting sequences, so don’t try to
write them all down. The final add-them-all-up stage is an infinite sum,
which makes it tricky. But the underlying concept remains: multiply
the probabilities by the counts, and add all the products together.
As an example of calculating one of the probabilities: if k = 2, what is the probability that abaab is a resulting sequence? Answer:
\left( \frac {p_a} {p_a+p_b} \right) ^ 3 \left( \frac {p_b} {p_a+p_b} \right) ^ 2.
The solution method introduces the concepts of conditional
probabilities and conditional expectations. So, for example, we ask
ourselves what is the probability that abaab is a resulting sequence
given that the sequence starts ab? Answer: we need to add two more
a letters and one more b letter, so the probability is \left(
\frac {p_a} {p_a+p_b} \right) ^ 2 \left( \frac {p_b} {p_a+p_b}
\right).
We move on to conditional expectation. How do we calculate the expected number of times that ab is a subsequence of the resulting sequence given that the resulting sequence starts ab? Follow the same sort of logic as above:
The next steps are tricky, but also almost obvious. When we make a list
of all the resulting sequences starting ab, then some of them will
start aba, and some of them will start abb. Let S be a resulting
sequence that starts aba. Write P(S|ab) for the probability that
S is a resulting subsequence given that it starts with ab, and
similarly write P(S|aba) for the probability that S is a
resulting subsequence given that it starts with aba. We calculate
these probabilities by counting the number of a and b after the
initial ab and aba respectively. The number of a in S after ab is one more that the number of a after aba. Thus we have
Now think about the add-them-all-up stage when calculating
E(ab). For each possible resulting sequence, we are adding the
product of the probability and the count of ab subsequences. In symbols we can write this expression in the form
Some of
the resulting sequences start aba and some start abb, and we could
start by adding up the two sorts separately, which gives an expression of the form
As already discussed, when S starts with aba we have P(S|ab) = \frac {p_a}{p_a + p_b} P(S|aba). Similarly when S starts with abb we have P(S|ab) = \frac {p_b}{p_a + p_b} P(S|abb), so
and finally we can substitute for the two sums, giving
This last equation is an explicit example of the recurrence relation
which we are trying to understand.
Sorry this explanation is so long. The underlying ideas are not hard once you have grasped something about conditional probability. You can try asking me again where it is not clear.
please help me.
dp[3][4] = E(abaaba) = E(aabbaa)
is the last ‘a’ is a typo or it’s correct? And if it is correct then please explain this and also if possible please write a little further explaining how this dp will handle the infinite case as ‘a’ can go extremely large!
Anyways thanks a lot for this. really appreciate your hard work in writing this. Happy new year:)
Sorry, that was a typo. Now corrected.
Correct expressions are
and
last question please!
How the recurrence comes? : “The main relation linking the E(i,j)E(i,j) values is (from codeforces tutorial)”
Yes conditional probability is the main key. Now, i think that i have understood it. Going to implement it today. If i faced any difficulty then i will ask.
thanks a lot mate! You helped me a lot …May 2018 brings you happiness
thanks finally after 3 days i have implemented it!