PROBLEM LINK:Author: Sergey Nagin DIFFICULTY:Medium PREREQUISITES:Combinatorics PROBLEM:Two strings $X$ and $Y$, each of length $n$, are similar if they can be made equal by applying the following operation at most once in each of them.
Given a string $A$ of length $n$, find the number of ordered pairs $(X,Y)$ of nonsimilar permutations of $A$, modulo $10^9 + 7$. QUICK EXPLANATION:An equivalent definition for similar would be: $X$ and $Y$ are similar if you can turn $X$ into $Y$ by performing at most two swaps. For every lowercase letter $c$, let $A_c$ be the number of occurrences of $c$ in $A$. Then the number of permutations of $A$ is $\frac{n!}{A_{\text{"a"}}!A_{\text{"b"}}!\cdots A_{\text{"z"}}!}$. Let this be $P$. We count the number of similar pairs of strings instead. The total number of pairs is $P^2$, so if $S$ is the number of similar pairs, then the number of nonsimilar pairs is $P^2  S$. Let $T$ be the number of strings similar to $A$. Then it can be shown that $S = PT$. Thus, all we need is to compute $T$. Finally, to compute $T$, we will need to separately count four kinds of similar strings formed from $A$:
After computing the $A_c$ values, these counts can be computed in $O(1)$ time each. EXPLANATION:String similarity and swap distanceLet's define the swap distance between two strings $X$ and $Y$ as the minimum number of swaps needed to turn $X$ into $Y$. We denote this by $D(X,Y)$. Note the following facts:
We can redefine string similarity in terms of swap distance as follows: Two strings $X$ and $Y$ are similar if and only if there exists a string $Z$ such that $D(X,Z) \le 1$ and $D(Z,Y) \le 1$. But this also implies that $D(X,Y) \le D(X,Z) + D(Z,Y) \le 1 + 1 = 2$. (In other words, it only takes at most $2$ swaps to turn $X$ to $Y$, namely, first turn $X$ to $Z$, then $Z$ to $Y$). Thus, one necessary criterion for two strings being similar is that $D(X,Y) \le 2$. But is this a sufficient criterion? Actually, yes, because if $D(X,Y) \le 2$, then there must exist some string $Z$ such that $D(X,Z) \le 1$ and $D(Z,Y) \le 1$, namely:
Thus, we now have an equivalent definition for string similarity: $X$ and $Y$ are similar if and only if $D(X,Y) \le 2$. Thus, the problem is equivalent to counting the number of ordered pairs of nonsimilar permutations of $A$ with swap distance $> 2$. ObservationsAs usual, we'll need a few more observations / insights to make our counting problem easier to solve. The first observation is that it's easier (or at least it seems to be easier) to count the number of pairs of similar permutations instead. This gives us our first observation. Let $P$ be the number of permutations of string $A$, and let $S$ be the number of pairs of similar permutations of $A$. Then the number of pairs of nonsimilar permutations of $A$ is $P^2  S$. To see this, note that $P^2$ is the number of pairs of permutations of $A$ (because there are $P$ choices for the first and second element of the pair), so if we subtract from it the number of similar pairs $S$, we get $P^2  S$, the number of nonsimilar pairs. Thus, we can focus on computing $P$ and $S$ instead. But $P$ is easy to compute using multinomial coefficients: For every lowercase letter $c$, let $A_c$ be the number of occurrences of $c$ in $A$. Then $$P = \frac{n!}{A_{\text{"a"}}!A_{\text{"b"}}!\cdots A_{\text{"z"}}!}.$$ (There is a possible source of confusion here, namely the usage of lowercase letters to denote variables and the actual characters themselves. To avoid ambiguity, we will always use quotes whenever we refer to the actual character. Without quotes, it always denotes a variable. Thus, the $c$ in $A_c$ is a variable, while the $\text{"c"}$ in $A_{\text{"c"}}$ is the actual character $\text{"c"}$.) Thus, all that remains is to compute $S$. Pairs of similar stringsWe need to count the number of similar strings $X$ and $Y$. Let's fix the string $X$. Note that $X$ has length $n$, and we know how many of each letter it has (from the $A_c$ values). We are interested in finding the number of distinct strings $Y$ that can be obtained by performing at most two swaps in $X$. We will need to distinguish four kinds of strings $Y$:
In this classification, we say "swap" to mean "swap between two different letters" to avoid double counting. Since every kind of similar string is exactly one of the above kinds, we can simply count each kind separately and add the results together. The first kind is actually very trivial, because there is only one string of that kind, namely $X$ itself. So we'll start on the second kind: Strings formed from $X$ using one swap.How many strings $Y$ can be formed from $X$ using exactly one swap of different letters? Let's count it. First, we need to select which letters to swap, say $i$ and $j$. To avoid double counting, we say $i < j$. Next, we need to select which among the $A_i$ and $A_j$ occurrences of $i$ and $j$, respectively, will be swapped. There are $A_i\cdot A_j$ such choices. By summing across all $(i,j)$ pairs, we now have our answer! $$\sum_{\text{"a"} \le i < j \le \text{"z"}} A_iA_j$$ Strings formed from $X$ using two overlapping swaps.Here, by overlapping swaps, we mean, for example, swapping the $i$th and $j$th character, and then the $i$th and $k$th character after that. (Notice that two swaps happened in the $i$th position.) Note that we can safely ignore swaps where $j = k$ because that has the same effect as doing no swaps at all. Let's understand what precisely happens during two overlapping swaps. Let's say the $i$th position contains the character $c_i$, the $j$th position contains $c_j$, and the $k$th position contains $c_k$. After the first swap, the $i$th position contains $c_j$, the $j$th position contains $c_i$, and the $k$th position contains $c_k$. After the second swap, the $i$th position contains $c_k$, the $j$th position contains $c_i$, and the $k$th position contains $c_j$. Thus, it turns out that two overlapping swaps is equivalent to a single 3cycle! So it turns out we are actually counting the number of strings $Y$ that can be obtained from $X$ by doing a 3cycle on three distinct characters. Let's count it. First, we need to select the three distinct letters to 3cycle, say $i$, $j$ and $k$. To avoid double counting, we say $i < j < k$. Next, we need to select which among the $A_i$, $A_j$ and $A_k$ occurrences of $i$, $j$ and $k$, respectively, will be permuted. There are $A_iA_jA_k$ such choices. Finally, we need to figure out the direction of the three cycle. There are two possible directions: $i \rightarrow j \rightarrow k \rightarrow i$ and $i \rightarrow k \rightarrow j \rightarrow i$. Thus, by summing across all $(i,j,k)$ triples, we get our answer: $$\sum_{\text{"a"} \le i < j < k \le \text{"z"}} A_iA_jA_k\cdot 2$$ Strings formed from $X$ using two disjoint swaps.Finally, we now try to count the number of strings that can be formed from $X$ using two disjoint swaps. This is actually the most complicated type, because even though the positions of the swaps themselves are distinct, the letters they're swapping might not be. For example, In the word In fact, we need to classify this type into three subtypes. Let's count them! Type 1: The characters in both swaps are all distinct. In this case, we need to select the four characters involved, say $i$, $j$, $k$ and $l$. To avoid double counting, let's say $i < j < k < l$. Next, we need to select which among the occurrences of $i$, $j$, $k$ and $l$ will be swapped. There are $A_iA_jA_kA_l$ such choices. Finally, we need to choose which letter swaps with which. There are three possibilities: $(i,j),(k,l)$, or $(i,k),(j,l)$, or $(i,l),(j,k)$. Thus, summing across all choices for $(i,j,k,l)$, we get: $$\sum_{\text{"a"} \le i < j < k < l \le \text{"z"}} A_iA_jA_kA_l\cdot 3$$ Type 2: The two swaps share one character in common. In this case, we need to select the three characters involved, say $i$, $j$ and $k$. This time, one of the characters will be different from the others, namely the character that will be shared in both swaps. Let's say this character is $i$. Now, to avoid double counting, we need to let $j < k$. However, $i$'s only restriction is that it must not be equal to $j$ or $k$ (because it is special). Finally, we need to select which among the occurrences of $i$, $j$ and $k$ will be swapped. There are $A_i(A_i1)A_jA_k$ such choices. Note the factor $A_i(A_i1)$, because there are $A_i$ ways to choose the place that $j$ will swap with $i$, and there are $A_i1$ ways to choose the place that $k$ will swap with $i$ (which must be different from the previously chosen position). Thus, summing across all choices for $(i,j,k)$, we get: $$\sum_{\text{"a"} \le i \le \text{"z"}} \sum_{\substack{\text{"a"} \le j < k \le \text{"z"} \\\ i \not= j \\\ i \not= k}} A_i(A_i1)A_jA_k$$ Type 3: The two swaps share two characters in common. In this case, we need to select the two characters involved, say $i$ and $j$. (No one is special this time.) To avoid double counting, let's say $i < j$. Finally, we need to select which among the occurrences of $i$, $j$ will be swapped. There are ${A_i \choose 2}{A_j \choose 2}$ such choices (because we need to choose two positions for each letter). Note that this time, it doesn't matter which occurrence swaps with which because the end result will be the same. Thus, summing across all choices for $(i,j)$, we get: $$\sum_{\text{"a"} \le i < j \le \text{"z"}} {A_i \choose 2}{A_j \choose 2}$$ Combining everythingBy adding all sums above, we now have computed the number of strings that are similar to $X$! Let's denote this number by $T$. The cool thing about what we did is that we didn't use any property specific to any $X$ other than that it was a permutation of $A$. Therefore, the sum $T$ is the same for every permutation $X$ of $A$. Since there are $P$ possible values of $X$ (remember $P$ from above?), it means that $S$, the number of ordered pairs of similar permutations of $A$, is simply $P\times T$! We now all have the necessary things to compute $P^2  S$, so we now have a working algorithm for the problem! Now, what about the time complexity? The first part is computing the $A_i$s which take $O(n)$ time. Also, $P$ itself takes $O(n)$ time to calculate because of the factorials. Finally, $S = PT$, and all calculations for $T$ depend only on the size of the array $[A_{\text{"a"}}, \ldots, A_{\text{"z"}}]$. But the size of this array is fixed at $26$! Therefore, computing $T$ takes $O(1)$ time! The overall algorithm runs in $O(n)$ time. Don't forget to reduce everything modulo $10^9 + 7$! The following is an implementation in Python. You can see here a technique of calculating factorials and inverse factorials modulo $10^9 + 7$.
Note that Python integers are automatically arbitraryprecision, so we don't need to worry about reducing As a final note, the huge amount of nested loops in the code above can be optimized to a single loop using an extensive amount of cumulative sums and dynamic programming. See the editorialist's solution for one way of doing it. Time Complexity:$O(n)$ AUTHOR'S AND TESTER'S SOLUTIONS:
This question is marked "community wiki".
asked 13 Mar '16, 14:26

I couldn't figure out what is the problem in this submission...:( MySolution Only got 75 points :( answered 15 Mar '16, 15:51

Here is my solution https://www.codechef.com/viewsolution/9601814. You may remove the comments to checkk the execution at each step. Happy coding :) answered 15 Mar '16, 16:25

Can you explain  "The two swaps share one character in common" part, with an example maybe ? And total number of pairs(P) should be ((P)(P1))/2 as pointed by @yougaindra , right ? Thanks. answered 15 Mar '16, 21:56
This example is from the editorial. Consider the word "aabc". Suppose we swap the first and third positions, and then the second and fourth positions. The result is "bcaa". These two swaps share one character in common: the character "a". This is what I meant by "The two swaps share one character in common".
(15 Mar '16, 22:19)
Actually I meant the formula for that Case > (Ai)(Ai−1)(Aj)(Ak) , is it like > for 'aabc', 'b' or 'c' can be swapped with 'a' at position 0 or 1(0 based indexing) and at the same time 'b' or 'c' can be swapped with 'a' at 1 position . Am I correct ?
(15 Mar '16, 22:28)
Yes, that is correct.
(15 Mar '16, 23:11)

can you explain in case of 2 disjoint swap's second case of one common character why it has not been multiplied by 2 ? answered 22 Mar '16, 21:53

@jragarwal in the case of two disjoint swap with one common element, lets say the characters we swap are a, b and c where a is the common one i.e. we swap (a, b) and (a, c). Also consider the frequencies of the characters a, b and c are fa, fb, fc respectively. Now, by swapping (a, b) we make fa*fb number of similar strings and for every such swap, we are number of a's left is fa1. So we can make (fa1)*fc number of similar strings by swapping (a, c). So total number of similar strings in this case is fafb(fa1)*fc. Hope this helps. answered 22 Mar '16, 22:19

I complicated this question by trying to find the number of possible unordered pairs and then planning to multiply that by 2. Did not work :( answered 22 Mar '16, 22:26

I don't understand why this problem is in easy level if the author says that it is medium level. answered 29 Mar '17, 09:48

Why the total number of pairs is P^2 ? It should be P * (P + 1) / 2. Right ? answered 21 Nov '17, 12:34
