MAXSEGM - Editorial

PROBLEM LINK:

Practice
Contest

Author: Ramazan Rakhmatullin
Testers: Lewin Gan, Marek Sommer
Editorialist: Pawel Kacprzak

DIFFICULTY:

Easy

PREREQUISITES:

Two pointers

PROBLEM:

You are given 2 arrays W = [W_1, W_2, \ldots, W_N] and C = [C_1, C_2, \ldots, C_N] with N elements each. We say that a range [L, R] is unique if all the elements C_L, C_{L+1}, \ldots, C_R are unique. The goal is to find the maximum possible sum W_L + W_{L+1} + \ldots + W_R for a unique range [L, R].

EXPLANATION:

First of all, notice that in both subtasks, values of W are up to 10^9, which means that the final sum can exceed 32-bit integer range, so you should use 64-bit integers to represent the sums.

Subtask 1

In the first subtask, there are up to 100 test cases to solve, but we know that the sum of N over all these test cases doesn’t exceed 10^4, so we can apply the following quadratic time approach. For all L = 1, 2, \ldots, N, let’s iterate over all R \geq L such that a range [L, R] is unique and update the result during this iteration. The key observation here is that if for some A, the range [L, A] is not unique, i.e. there exists two indices L \leq i < j \leq A, such that C_i = C_j, then any range [L, B] for B > A is also not unique. In order to track uniqueness of elements in a range during iterations, we can use a hash-table, or even better, after noticing that values in C are non-negative integers less than N, we can use a simple boolean array for this purpose.

The following pseudocode illustrates the above appraoch:

result = 0
for L = 1 to N:
    for i = 0 to N-1:
        seen[i] = False # initially all are unseen
    currentSum = 0
    for R = L to N:
        if seen[C[R]]:
            break
        currentSum += W[R]
        result = max(result, currentSum)
        seen[C[R]] = True
print result

The total time complexity of this method is clearly O(N^2) for a single test case, which is sufficient in this subtask.

Subtask 2

In the second subtask, there are also up to 100 test cases to solve, but now the sum over N over these test cases doesn’t exceed 10^6. Thus we need a faster approach, ideally a linear one.

The key observation is to notice that values in the array W are non-negative, which means that if a range [L, R] is unique, and we can extend it in any direction, then we can always do that. It follows, that if we have a unique range [L, R], we want to extend in both directions as much as it stays unique. This leads to the following approach:

Let’s start with L = 1. We are going to use a boolean array for tracking already seen elements in the current range - similarly as in the solution for the first subtask. Now, let’s find maximum R such that the range [L, R] is unique. It means that either R = N or value C[R+1] appears in range [L, R] in array C. We do this by simply iterating from R = L until we find this maximum R. Now we know that [L, R] is a unique range and we can update the result with its sum, which can be computed while iterating. Now comes the second key observation. If R = N we are done, because there are no remaining elements to the right. Otherwise, in order to extend the current range to the right, we have to find the minimum i \geq L, such that C[i] = C[R+1], update L to i+1 and start extending the right endpoint of the range from R while it remains unique. We repeat this process until R becomes N. The following pseudocode illustrates this approach:

result = 0
currentSum = 0
L = 1
R = 1
for i = 0 to N-1:
    seen[i] = False # initially all are unseen
while True:
    while R <= N and not seen[C[R]]:
        currentSum += W[R]
        seen[C[R]] = True
        R += 1
    result = max(result, currentSum)
    if R == N+1:
        break
    while seen[C[R]]:
        seen[C[L]] = False
        currentSum -= W[L]
        L += 1
print result

The time complexity of this approach is O(N) for a single test case because both L and R can be updated at most N times in the loops and the total complexity is dominated by the loops where these values are updated in every iteration. This method is often called “Two pointers” approach and it’s useful in approaching many problems, even very advanced ones.

AUTHOR’S AND TESTER’S SOLUTIONS:

Setter’s solution can be found here.
First tester’s solution can be found here.
Second tester’s solution can be found here.
Editorialist’s solution can be found here.

3 Likes

Could you please tell me where I went wrong?
This is the link to my code CodeChef: Practical coding for everyone

I think my approach was the same. Just instead of a boolean array, I kept checking elements with a set, which is less memory-consuming I think. Still, the code couldn’t pass the second subtask because of TLE. Can anybody point out the mistake?

https://www.codechef.com/viewsolution/14331818

If someone has some test cases cant they please help me out? My code (CodeChef: Practical coding for everyone) got WA I’m not able to understand why.

Thank You

@prajneya
Problem is where you do

right=left

Now your code is like O(n^2)

Instead, you should do

right=left=last occurrence of duplicate element encountered+1

it seems to me that my solution is more ergonomic, and works for O(N)

here is the link : CodeChef: Practical coding for everyone

the main idea was :

  1. in W[i] keep partial summs from W[1] to W[i], so you can easy find LR segment sum by W[R] - W[L - 1]

  2. due to 1 ≤ N ≤ 1000000 , 0 ≤ Ci < N, we are able to have a mask array with N-ranged size and keep in the i-th element of the mask the last index of C[j] that is equal to i

  3. the last movement was :

    const int max_size = 1000001;
    int mask_[max_size];
    int last_duplicate;
    long long max_sum;
    int n;
    long long W[max_size];
    int C[max_size];

for each test we do:

memset(mask_, 0, sizeof(int) * max_size);
max_sum = 0; last_duplicate = 0;

// reading input

// calculating partial summs

for (int i = 1; i <= n; ++i) {
	if (mask_[C[i]] == 0)
		mask_[C[i]] = i;
	else {
	    max_sum = MAX(max_sum, W[i - 1] - W[last_duplicate]);
		last_duplicate = MAX(mask_[C[i]], last_duplicate);
		mask_[C[i]] = i;    		
	}
}
max_sum = MAX(max_sum, W[n] - W[last_duplicate]);

printf("%lld\n", max_sum);
5 Likes

I think my solution is easier to understand and the code is also simple.

This is my solution:

1 Like

I think i have also used a O(n) approach but i am getting a wrong answer.Could anyone help me out what is wrong in my code or for what test cases it is giving wrong output? Here is the link to my code:CodeChef: Practical coding for everyone

is really second approach is O(n)… i can see that, its O(n^2)…is i m right? there exist last while loop as well which is as follows–

 while seen[C[R]]:
    seen[C[L]] = False
    currentSum -= W[L]
    L += 1

here it is somewhat taking extra time than o(n)… can anyOne explain what exactly time Complexity of second approach… because here loop inside loop exists which suggests its O(N^2)

@hemant_dhanuka Regarding time complexity in second approach

Loop 1 can execute maximum of N steps as in each step R increases by one so at (N+1)th loop iteration R<=N condition will be violated and the outer while loop will also break.

Loop 2 can execute maximum of N-1 times. At one loop instance, maximum L can go is upto R-1 and L is never decreased, so sum of all the steps in this loop will be less than R and R can go upto N.

Hope this clears the doubt. I don’t know how much clear was that.

can somone please tell where i went wrong?

here’s my code-https://www.codechef.com/viewsolution/14340260

If you want to know more about two pointers technique, then read it here.

For practice, different level of difficulty problems, refer here, hope it helps.

Can anyone help why one is giving runtime error and other AC

runtime error : CodeChef: Practical coding for everyone

AC : CodeChef: Practical coding for everyone

what am i doing wrong for runtime error solution?

I don’t understand why this is showing Wrong answer…

#include <bits/stdc++.h>
using namespace std;


int occupied[1000001];
long long int wrr[1000001];

int main()
{

ios::sync_with_stdio(false);
int t,n,crr[1000001],pos;
long long int presum[1000001];
long long int sum=0,maxi=0;
cin>>t;
while(t--)
{
    cin>>n;
    sum=0;
    memset(presum,0,sizeof(presum));
    memset(occupied,0,sizeof(occupied));
    maxi=0;
    pos=1;
    for(int i=1;i<=n;i++)
    {
        cin>>crr[i];
    }
    for(int i=1;i<=n;i++)
    {
        cin>>wrr[i];
        presum[i]=presum[i-1]+wrr[i];
    }
    for(int i=1;i<=n;i++)
    {
        if(occupied[crr[i]]==0)
        {
            sum+=wrr[i];
            occupied[crr[i]]=i;
        }
        else
        {

            int temp=presum[occupied[crr[i]]]-presum[pos]+wrr[pos];

            sum=sum-temp+wrr[i];
            pos=occupied[crr[i]]+1;
            occupied[crr[i]]=i;

        }
        if(sum>maxi)
        {
            maxi=sum;
        }
    }
      cout<<maxi<<endl;
   }
   return 0;
   }

In subtask 1 How is O(n^2) Approach possible???
Time Complexity is O(n^2 * T)

Can someone help me out with what’s wrong in my code? It’s in python. I’m getting WA.
Here is the link: 14403325

1
6
4 4 5 5 6 6
1 2 3 4 5 6
Ans should be 9 and not 5

I updated the link to my problem. Pl check

Thanks a lot siddharth bro :slight_smile:

No set uses BST and each insertion or deletion in BST uses /logn time