Yahoo Hackathon DP Problem

prasu_newbie · June 22, 2013, 6:18pm

There is a sequence of N random integers, which can contain any integer between -10 and 10 (both of which are inclusive). The task is to find the number of continuous sequences of integers such that their sum is -1 or +1

1 <= n <= 10^6

kuruma · June 22, 2013, 9:39pm

Hello,

As the use of DP is needed, I say that I also don’t know how to solve this problem, but, usually, a recurrence relationship needs to be derived, especially one that can exploit the property of optimal substructure…

For instance, let’s try to denote:

S(i,j) -> sum of the elements of the array that start at element i with length j;

Also let our array be denoted as Arr.

As we want a contiguous subsequence, we can have three cases to consider now:

Case 1

S(i,j) = -1 or S(i,j) = 1, case where we add this sequence to our total sequence counter.

Case 2

S(i,j) + Arr[i+j+1] > 1 or S(i,j) + Arr[i+j+1] < -1;

This case means that, as the next element on our array, (remember we start on element i and have a sequence of length j) A[i+j+1] is not a suitable candidate to form an admissable sequence, so we can possibly restart our sequence counter as S(i+j+1, vl), where vl denotes the variable length that our new subsequence can take.

Case 3

A case where S(i,j) + Arr[i+j+1] = -1 or S(i,j) + Arr[i+j+1] = 1, case where we should count this sequence and where we have S(i,j+1) being similar to case 1.

Possibly there might be more cases or this formulation might be incorrect, and maybe someone more skilled than me can help you better…

I just wrote the formulation that came to my head.

This seems like a mix between Kadane’s algorithm and the subset sum problem, but I am yet too newbie in DP to help in a more specific way…

Best regards,

Bruno

mjnovice · June 22, 2013, 10:28pm

I think the following code can solve the problem. If any explanation is wanted, please ask I shall explain.

#include<memory.h>
#include<stdio.h>
#define OFFSET 1000000
#define LIMIT 1000000
main()
{
int sumupto[LIMIT+2];
int countsum[20*LIMIT+2];//mapping is as 0->-10000000 and 20000000-> +10000000
int n,num;
unsigned long long int ans=0;
int sum=0;
scanf("%d",&n);
for(int i=0;i<20*LIMIT+2;i++)countsum[i]=0;
for(int i=0;i<n;i++)
{
    scanf("%d",&num);
    sum+=num;
    sumupto[i]=sum;
    ans+=countsum[sumupto[i]-1+OFFSET];
    ans+=countsum[sumupto[i]+1+OFFSET];
    if(sumupto[i]==1 or sumupto[i]==-1)
        ans+=sumupto[i];
    countsum[sumupto[i]+OFFSET]++;
}
printf("%llu",ans);
return 0;
}

Heres the explanation.
I have an array of integers of size n where 1<n<10^6 and each element lies in the range -10 to 10.
The array element sumupto[i] keeps the sum upto the index i in the given array.
Now we also see that the maximum possible value that can ever occur in the array sumupto is 10^7 (ie when n=10^6 and all elements are 10) also, the minimum possible value would be -10^7.
Now the array element countsum[i] gives the number of entries with value i in the array sumupto, and so its size must be atleast 2*10^7, and use of offset 10^7 to keep track of the number of negative entires.
To find a continuous subsequence, whose sum equals 1 or -1 we simply need to find the number of instances such that abs(sumupto[i]-sumupto[j])=1 where 1<i<n and 1<j<n. In doing this we would be counting each such subsequence twice.
To do this we simply use the hashtable array countsum, as the number of entries with value = sumupto[i]+1 or value = sumupto[i]-1 would give us the answer, but again with each sequence counted twice.
Thus we get our answer!

anm217 · June 22, 2013, 10:46pm

#include<stdio.h>
#include<math.h>
int main(){
int t,n;
scanf("%d",&t);
while(t–){
scanf("%d",&n);
int arr[n],i,j,count=0,c,sum;
for(i=0;i<n;i++)
scanf("%d",&arr[i]);
for(i=0; i<n; i++)
{
//if(arr[i]==1 || arr[i]==-1) count++;
sum=0;
for(j=i; j<n ;j++)
{
sum = sum + arr[j];
if(sum==1 || sum==-1) count++;
}
}
printf("%d\n",count);
}
return 0;
}

c0d3junki3 · June 23, 2013, 12:28pm

Well since you didn’t give any memory limit, i’m just going to assume it’s 256MB.

Let dp[i] be the sum from 0…i
for example : dp[2] = array[0] + array[1] + array[2].
Calculate the dp array : dp[i] = dp[i - 1] + array[i].
Let cnt[dp[i]] be the number of times dp[i] appears.Array cnt ranges from -10.000.000 to 10.000.000.
Iterate through every dp[i], i = 0…n-1 and to the answer add cnt[dp[i]+1] and cnt[dp[i]-1]. Reason is you are trying to find a number of pairs of indices (i, j) such that j < i and dp[i] - dp[j] = -1 or 1. Don’t forget to use a 64bit integer as an answer. Also don’t forget to count every dp[i] that’s 1 or -1.

Here is a simple pascal code.

var cnt:array[-10000000..10000000] of integer;
    dp:array[0..1000000] of integer;
    currNum, N, i:integer;
    ans:int64;
begin
  read(N);
  for i := 0 to N - 1 do begin
    read(currNum);
    if (i <> 0) then dp[i] := dp[i - 1];
    dp[i] := dp[i] + currNum;
    ans := ans + cnt[dp[i] + 1];
    ans := ans + cnt[dp[i] - 1];
    if (dp[i] = -1) or (dp[i] = 1) then ans := ans + 1;
    cnt[dp[i]] := cnt[dp[i]] + 1;
  end;
  writeln(ans);
end.

Complexity : O(N).

Memory used : 41248 kB.

T9v3Kc - Online Pascal Compiler & Debugging Tool - Ideone.com -
ideone link where you can test the code.

sumanth232 · June 23, 2013, 2:50pm

Your logic will fail at an edge case.
If there is only one number in the given sequence- the number being 1,
ur code gives ans=0

sumanth232 · June 23, 2013, 2:53pm

What I wanted to point is-
Let the sequence be from index i to index j -
that is a[i] + a[i+1] + … + a[j] = -1 or 1
then all those sequences where i=0 will not be recognized

sumanth232 · June 23, 2013, 3:36pm

Brilliant answer. Cleverly avoiding the double-counting and took care of the corner case.

mjnovice · June 23, 2013, 8:42pm

@sumanth32 yep, you are right, I ignored the cases wherein the sum of sequences is 1 or -1, also if I calculate as I input the numbers I would avoid the error of double counting, which is so nicely presented by c0d3junki3. Correcting it!

vikasnitt · August 14, 2013, 12:45am

This is a simple Brute Force you are doing. Try to optimize the problem by using DP.