Count all substring which can be made palindrome in string

ssjgz · August 24, 2019, 11:50am

Does anyone have any objection to my posting it now? I don’t want to steal @anon55659401 's thunder

I suppose any ongoing contest can be assumed to have finished by now?

Edit:

So it looked like the consensus was that people would post solutions “in a few hours”, and that was more than a few hours ago

This is a bit seat-of-the-pants - I think the approach is sound (it survives plenty of randomised testing) but the documentation might not be 100%:

CPP Solution

#define SUBMISSION
#define BRUTE_FORCE
#ifdef SUBMISSION
#undef BRUTE_FORCE
#define NDEBUG
#endif
#include <iostream>
#include <vector>
#include <map>

#include <cassert>

#include <sys/time.h> // TODO - this is only for random testcase generation.  Remove it when you don't need new random testcases!

using namespace std;

template <typename T>
T read()
{
    T toRead;
    cin >> toRead;
    assert(cin);
    return toRead;
}

int64_t solveBruteForce(const string& s)
{
    int64_t result = 0;

    for (int i = 0; i < s.size(); i++)
    {
        for (int j = i; j < s.size(); j++)
        {
            const int numLetters = 26;
            int letterHistogram[numLetters] = {};
            for (int k = i; k <= j; k++)
            {
                letterHistogram[s[k] - 'a']++;
            }
            int numLettersWithOddOccurrence = 0;
            for (int letterIndex = 0; letterIndex < numLetters; letterIndex++)
            {
                if ((letterHistogram[letterIndex] % 2) == 1)
                    numLettersWithOddOccurrence++;
            }
            if (numLettersWithOddOccurrence == 0 || numLettersWithOddOccurrence == 1)
            {
                cout << "Palindromic: " << s.substr(i, j - i + 1) << endl;
                result++;
            }
        }
    }
    
    return result;
}

int64_t solveOptimised(const string& s)
{
    // Let the "bit value" of a letter a, b, c ... be defined as
    //
    //   2 ** (distance of letter from 'a')
    //
    // i.e. 
    //
    //  bit_value(a) == 2 ** 0 = 1
    //  bit_value(b) == 2 ** 1 = 2
    //  bit_value(c) == 2 ** 2 = 4
    //
    // etc.
    //
    // Let the "xor sum" of a substring
    //
    //   a_i a_(i+1) ... a_j 
    //
    // be defined as 
    //
    //   xorSum (a_i a_(i+1) ... a_j) = bit_value(a_i) ^ bit_value(a_(i+1)) ^ ... ^ bit_value(a_j)
    //
    // Then a substring s will be a palindrome if and only if the binary representation
    // of xorSum(s) has *at most* one bit set.
    //
    // Let prefixXorSum(k) = xorSum(a1 a2 ... a_k); then note that
    //
    //   xorSum(a_i a_(i+1) ... a_k) = prefixXorSum(i - 1) ^ prefixXorSum(k)
    //
    // Thus, to find the number of palindromic-substrings ending at a_k, we need to calculate prefixXorSum(k)
    // and then find the number of i's such that prefixXorSum(i) ^ prefixXorSum(k) has at most one bit set
    // i.e. the number of i's such that prefixXorSum(i) ^ prefixXorSum(k) is either 0, or a power of 2.
    // This is pretty easy.
    int64_t result = 0;

    // We can probably use a large (2 ** 26 * sizeof(int) ~ 256MB) lookup table instead of a std::map, here,
    // giving us asymptotically better runtime performance at the expense of memory usage.
    map<uint32_t, int> numPrefixesWithXorSum;
    numPrefixesWithXorSum[0] = 1; // Empty prefix.

    uint32_t prefixXorSum = 0;
    for (int k = 0; k < s.size(); k++)
    {
        const int letterIndex = s[k] - 'a';
        prefixXorSum = prefixXorSum ^ (1 << letterIndex); 

        result += numPrefixesWithXorSum[prefixXorSum]; // Num substrings ending at k with xorSum == 0.
        for (int i = 0; i < 26; i++)
        {
            result += numPrefixesWithXorSum[prefixXorSum ^ (1 << i)]; // Num substrings ending at k with the ith bit of their xorSum set.
        }

        numPrefixesWithXorSum[prefixXorSum]++;
    }
    
    return result;
}


int main(int argc, char* argv[])
{
    ios::sync_with_stdio(false);
    if (argc == 2 && string(argv[1]) == "--test")
    {
        struct timeval time;
        gettimeofday(&time,NULL);
        srand((time.tv_sec * 1000) + (time.tv_usec / 1000));

        const int N = rand() % 1'000 + 1;
        const int maxLetter = rand() % 26 + 1;

        cout << 1 << endl;
        //cout << N << endl;
        for (int i = 0; i < N; i++)
        {
            cout << static_cast<char>('a' + rand() % maxLetter);
        }
        cout << endl;

        return 0;
    }
    
    const int T = read<int>();
    
    for (int t = 0; t < T; t++)
    {
        const string s = read<string>();
#ifdef BRUTE_FORCE
        const auto solutionBruteForce = solveBruteForce(s);
        cout << "solutionBruteForce: " << solutionBruteForce << endl;
        const auto solutionOptimised = solveOptimised(s);
        cout << "solutionOptimised: " << solutionOptimised << endl;

        assert(solutionOptimised == solutionBruteForce);
#else
        const auto solutionOptimised = solveOptimised(s);
        cout << solutionOptimised << endl;
#endif
    }

    assert(cin);
}

raisinten · August 24, 2019, 11:55am

I hope its not from any long contest

raisinten · August 24, 2019, 12:22pm

@ssjgz What a solution! Just blew my mind! I wouldn’t have come up with that even if I were given a week to think. I wonder if there’s any other method of solving this without using xor hacks. Or is it maybe a standard problem to be solved using the properties of xor alone? Thank you!

ssjgz · August 24, 2019, 12:32pm

The key to it is to realise (I actually just “knew” this from previous problems) that a) two strings are anagrams if and only if their “letter histograms” are equal and b) a string is a palindrome if and only if it’s letter histogram is either all even numbers or just one odd number i.e. we only care about letter parity.

If we think of parity of a letter, it’s a small step to thinking about representing a substring’s parity as a binary number. Then I just went through a mental checklist of known techniques, one of the first of which (by sheer good luck!) was the Olde “Prefix Sum” Trick - how can we represent (and easily update) the “letter-parity-histogram” binary number of a prefix of a string? Appending a letter will toggle the parity of that letter in the letter-parity-histogram, and “toggle” usually implies either “not” or “xor”. Thinking about xor’s and prefix xor sums made me think of the recent Guddu and his Mother, and the rest soon fell into place

So it was mainly a case of following your nose and thinking about how to adapt well-known techniques to work with what you’ve found - plus a hefty dollop of luck

I’m betting you would have thought of it within a week

raisinten · August 24, 2019, 12:37pm

@ssjgz Very well explained! Yea, to be honest, I did get some Guddu and his Mother feels while going through this. Haha! Thanks for the betting on me. Cheers!

s5960r · August 24, 2019, 12:44pm

Awesome ! The thought process that leads to a solution is lot valuable. Btw if u don’t mind may i ask if this your first codechef handle? How much years of experience you have in sport programming?

ssjgz · August 24, 2019, 12:46pm

Thanks! Yes, it’s my first codechef handle - I migrated from Hackerrank as they’ve pretty much stopped doing contests.

My first actual contest was in November 2017, but I’d solved many, many practice problems by then

s5960r · August 24, 2019, 12:50pm

It’s great ! I’ve been reading all your posts lately ! Keep them coming, learned something new today Thanks!

ssjgz · August 24, 2019, 12:52pm

Very much appreciate the kind words - many thanks indeed

shivamchef · August 24, 2019, 3:10pm

Really an amazing use of xor. @karangreat and @l_returns excited to see your approaches also.

l_returns · August 25, 2019, 4:42am

Mine was O(n^2)
@ssjgz’s approach is awesome.

anon55659401 · August 25, 2019, 4:52am

Mine is ditto same as that of @ssjgz .

ssrivastava990 · August 25, 2019, 5:00am

@anon55659401 bro but u told your complexity is O(26*N) is is same as above solution

ssjgz · August 25, 2019, 11:48am

Depending on whether you use a std::map or a vector for numPrefixesWithXorSum, the complexity is either:

a) O(N \times |\Sigma| \times \log_2 N) time and O(N) space; or
b) O(N \times |\Sigma|) time and O(N + 2^{|\Sigma|}) space

ssrivastava990 · August 25, 2019, 11:57am

just tell me a simple thing ( don’t mind i m not so good coder so ) can we did in O(26*N) only
and the approach u told is very nice but can we do it in simpler more understable way for new beginners or those who don’t know about these xor properties

ssjgz · August 25, 2019, 11:59am

You can do it in O(N \times |\Sigma|) time if you don’t mind using O(N + 2^{|\Sigma|}) space for your lookup array/ vector (where |\Sigma| is the size of our alphabet i.e. 26 in this case).

I personally don’t know whether there is a simpler way of doing it.

ssrivastava990 · August 25, 2019, 12:01pm

You can do it in O(N×∣Σ∣)O(N \times |\Sigma|)O(N×∣Σ∣) time if you don’t mind using O(N+2∣Σ∣)O(N + 2^{|\Sigma|})O(N+2∣Σ∣) space for your lookup array/ vector
how??

ssjgz · August 25, 2019, 12:03pm

I’ve not tried it myself, but see the comment in the code: change numPrefixesWithXorSum to be a std::vector instead of a std::map and resize it to the appropriate size (2^{26}). Should work, I think.

raisinten · August 25, 2019, 4:15pm

I just found a solution that uses the exact same strategy as yours. Here’s the link. I’m starting to think that mighty xor is the only way of accomplishing our task.

ssjgz · August 25, 2019, 4:16pm

Yeah, I really can’t think of any other way of doing it. Good find