Yahoo Hackathon DP Problem

dynamic-programming

#1

There is a sequence of N random integers, which can contain any integer between -10 and 10 (both of which are inclusive). The task is to find the number of continuous sequences of integers such that their sum is -1 or +1

1 <= n <= 10^6


#2

Hello,

As the use of DP is needed, I say that I also don’t know how to solve this problem, but, usually, a recurrence relationship needs to be derived, especially one that can exploit the property of optimal substructure…

For instance, let’s try to denote:

S(i,j) -> sum of the elements of the array that start at element i with length j;

Also let our array be denoted as Arr.

As we want a contiguous subsequence, we can have three cases to consider now:

Case 1

S(i,j) = -1 or S(i,j) = 1, case where we add this sequence to our total sequence counter.

Case 2

S(i,j) + Arr[i+j+1] > 1 or S(i,j) + Arr[i+j+1] < -1;

This case means that, as the next element on our array, (remember we start on element i and have a sequence of length j) A[i+j+1] is not a suitable candidate to form an admissable sequence, so we can possibly restart our sequence counter as S(i+j+1, vl), where vl denotes the variable length that our new subsequence can take.

Case 3

A case where S(i,j) + Arr[i+j+1] = -1 or S(i,j) + Arr[i+j+1] = 1, case where we should count this sequence and where we have S(i,j+1) being similar to case 1.

Possibly there might be more cases or this formulation might be incorrect, and maybe someone more skilled than me can help you better…

I just wrote the formulation that came to my head.

This seems like a mix between Kadane’s algorithm and the subset sum problem, but I am yet too newbie in DP to help in a more specific way…

Best regards,

Bruno


#3

I think the following code can solve the problem. If any explanation is wanted, please ask I shall explain.

#include<memory.h>
#include<stdio.h>
#define OFFSET 1000000
#define LIMIT 1000000
main()
{
int sumupto[LIMIT+2];
int countsum[20*LIMIT+2];//mapping is as 0->-10000000 and 20000000-> +10000000
int n,num;
unsigned long long int ans=0;
int sum=0;
scanf("%d",&n);
for(int i=0;i<20*LIMIT+2;i++)countsum*=0;
for(int i=0;i<n;i++)
{
    scanf("%d",&num);
    sum+=num;
    sumupto*=sum;
    ans+=countsum[sumupto*-1+OFFSET];
    ans+=countsum[sumupto*+1+OFFSET];
    if(sumupto*==1 or sumupto*==-1)
        ans+=sumupto*;
    countsum[sumupto*+OFFSET]++;
}
printf("%llu",ans);
return 0;
}

Heres the explanation.
I have an array of integers of size n where 1<n<10^6 and each element lies in the range -10 to 10.
The array element sumupto* keeps the sum upto the index i in the given array.
Now we also see that the maximum possible value that can ever occur in the array sumupto is 10^7 (ie when n=10^6 and all elements are 10) also, the minimum possible value would be -10^7.
Now the array element countsum* gives the number of entries with value i in the array sumupto, and so its size must be atleast 210^7, and use of offset 10^7 to keep track of the number of negative entires.
To find a continuous subsequence, whose sum equals 1 or -1 we simply need to find the number of instances such that abs(sumupto
-sumupto[j])=1 where 1<i<n and 1<j<n. In doing this we would be counting each such subsequence twice.
To do this we simply use the hashtable array countsum, as the number of entries with value = sumupto*+1 or value = sumupto*-1 would give us the answer, but again with each sequence counted twice.
Thus we get our answer!


#4

#include<stdio.h>
#include<math.h>
int main(){
int t,n;
scanf("%d",&t);
while(t–){
scanf("%d",&n);
int arr[n],i,j,count=0,c,sum;
for(i=0;i<n;i++)
scanf("%d",&arr*);
for(i=0; i<n; i++)
{
//if(arr*==1 || arr*==-1) count++;
sum=0;
for(j=i; j<n ;j++)
{
sum = sum + arr[j];
if(sum==1 || sum==-1) count++;
}
}
printf("%d
",count);
}
return 0;
}


#5

Well since you didn’t give any memory limit, i’m just going to assume it’s 256MB.

  1. Let dp* be the sum from 0…i
    for example : dp[2] = array[0] + array[1] + array[2].
    Calculate the dp array : dp* = dp[i - 1] + array*.

  2. Let cnt[dp*] be the number of times dp* appears.Array cnt ranges from -10.000.000 to 10.000.000.

  3. Iterate through every dp*, i = 0…n-1 and to the answer add cnt[dp*+1] and cnt[dp*-1]. Reason is you are trying to find a number of pairs of indices (i, j) such that j < i and dp* - dp[j] = -1 or 1. Don’t forget to use a 64bit integer as an answer. Also don’t forget to count every dp* that’s 1 or -1.

Here is a simple pascal code.

var cnt:array[-10000000..10000000] of integer;
    dp:array[0..1000000] of integer;
    currNum, N, i:integer;
    ans:int64;
begin
  read(N);
  for i := 0 to N - 1 do begin
    read(currNum);
    if (i <> 0) then dp* := dp[i - 1];
    dp* := dp* + currNum;
    ans := ans + cnt[dp* + 1];
    ans := ans + cnt[dp* - 1];
    if (dp* = -1) or (dp* = 1) then ans := ans + 1;
    cnt[dp*] := cnt[dp*] + 1;
  end;
  writeln(ans);
end.

Complexity : O(N).

Memory used : 41248 kB.

http://ideone.com/T9v3Kc -
ideone link where you can test the code.


#6

Your logic will fail at an edge case.
If there is only one number in the given sequence- the number being 1,
ur code gives ans=0


#7

What I wanted to point is-
Let the sequence be from index i to index j -
that is a* + a[i+1] + … + a[j] = -1 or 1
then all those sequences where i=0 will not be recognized


#8

Brilliant answer. Cleverly avoiding the double-counting and took care of the corner case.


#9

@sumanth32 yep, you are right, I ignored the cases wherein the sum of sequences is 1 or -1, also if I calculate as I input the numbers I would avoid the error of double counting, which is so nicely presented by c0d3junki3. Correcting it!


#10

This is a simple Brute Force you are doing. Try to optimize the problem by using DP.