You are not logged in. Please login at www.codechef.com to post your questions!

×

# CHEFLKJ - Editorial

Author: Misha Chorniy
Tester: Karan Aggarwal
Editorialist: Pushkar Mishra

Medium

Segment Trees

# PROBLEM:

Given is an array $A$ of length $N$. There are two types of operations that we need to handle on this array:

• 1 $x$ $y$: change the value of $A[x]$ to $y$, i.e., $A[x] = y$.
• 2 $l$ $r$: Answer in a "Yes" or a "No" whether the subarray $A[l..r]$ is dominating or not.

Here dominating means that there is a number in the subarray that appears in the subarray at least as many times as one more than half of the length of the subarray.

# EXPLANATION:

The nature of the problem clearly hints towards segment trees. We need to efficiently perform two kinds of operations: update a value at an index, and answer for a subarray whether there exists a number in it which appears at least (one more than half its length) times in it.

Clearly, There isn't a notion of an elegant lazy propagation here since the updates we perform are to single elements.

Let us start by making some observations about dominant arrays and subarrays. Let us consider the entire array $A[1..N]$. The first important observation is that it will only be dominating if either subarray $A[1..\frac{N}{2}]$ is dominating or subarray $A[\frac{N}{2}+1..N]$ is dominating. This is simple to prove. We use proof by contradiction. Assume that neither of the subarrays are dominatinng. That means all the numbers in either of the subarrays appear at maximum $\frac{N}{4}$ times only. That means, if we combine the two arrays, there is no number which appears more than $\frac{N}{2}$ times. This is a contradiction. Therefore, $A$ is not dominating.

We can further extend the observation to say that if $A[1..N]$ is dominating, then the dominating number must either be the dominating number of $A[1..\frac{N}{2}]$ or the dominating number of $A[\frac{N}{2}+1..N]$. This follows from the same arguments that we made above.

This leads us to our method of building our segment tree. Each node of the segment tree stores 2 fields: an integer 'Dominant' which stores the dominant number in the subarray covered by the node, and a map (it can be a hash map for better bounds) 'CountOccurrences' which counts the number of times each number appears in the subarray covered by the node. The memory used in this is $\mathcal{O}(N\log N)$ because each number is stored in $\log N$ nodes only. So memory isn't an issue in our structure at all.

The pseudocode for the build operation is pretty straightforward now:

void buildNode (node r):

if(r == leaf) {
//let us say that this leaf corresponds to A[index] of the given array.

SegTree[r].CountOccurrences.clear(); // clear map of any garbage

// Increment the occurrence of the number at this leaf
SegTree[r].CountOccurrences[a[x]] += 1;
SegTree[r].Dominant = a[x]; // a number dominates its own cell

return;
}

if(r != leaf) {
buildNode(r->left);
buildNode(r->right);

length = r->length; // length of the subarray at this node.

// We merge the maps of children. Merging simply means adding the
// number of occurrences of each number in either of the children maps
// to the parent map.
SegTree[r].CountOccurrences = merge(SegTree[r->left].CountOccurrences,
SegTree[r->right].CountOccurrences);

if(SegTree[r].CountOccurrences[SegTree[r->left].Dominant] > length/2) {
SegTree[r].Dominant = SegTree[r->left].Dominant;

} else if (SegTree[r].CountOccurrences[SegTree[r->right].Dominant] > length/2) {
SegTree[r].Dominant = SegTree[r->right].Dominant;

} else {
// indicates that subarray at this node is non-dominating.
SegTree[r].Dominant = -1;
}
}
}


The update routine is very similar. We just subtract 1 from the number of occurrences of the old value and add 1 to the number of occurrences of the new value at the correspoing leaf plus all its ancestors. Recalculating the 'Dominant' variable is done in the same way as the build function.

With this data structure, how can we query whether a subarray is dominant or not. We use the same logic as before. To query a subarray, we will be visiting at most $\mathcal{O}(\log N)$ nodes in the segment tree. Each of these nodes will have its dominant number (or -1 in case it is not dominating). If the subarray we are querying about is dominating then the dominating number must be one of the dominating numbers of the nodes we visit in the query operation. Therefore, the first thing to find out is the nodes that the query function is going to visit in the segment tree. Then for the dominating number of each of them, we just count how many times they appear over all the nodes that the query function visited. If any of the dominating numbers appeared (more than half the length of the subarray) times, then the subarray is dominating; otherwise no.

What is the complexity of this approach? Build is same as the standard segment tree function which takes $\mathcal{O}(N\log N)$. This is when we are using hash maps in our segment tree nodes. If we use normal maps, then the complexity will be $\mathcal{O}(N\log^2 N)$. Similarly, update takes $\mathcal{O}(\log N)$ with hash maps and $\mathcal{O}(\log^2 N)$ with normal maps. The query function on the segment tree itself is $\mathcal{O}(\log N)$ but we need to count the occurrences of each of $\mathcal{O}(\log N)$ dominating numbers in the $\mathcal{O}(\log N)$ nodes queried. Therefore, the total complexity per query is $\mathcal{O}(\log^2 N)$ with hash maps and $\mathcal{O}(\log^3 N)$ with normal maps.

# ALITER

There is one more way to solve this problem that uses randomisation. For a subarray $A[l..r]$, choose any number $x$ randomly; the probability that this number is the dominating number is $\frac{1}{2}$. We can check by counting the number of its occurrences using the $CountOccurrences$ structure we build before. If it is not the dominating number, we can choose some other number and try. So, in $i$ such iterations, the probability that we don't find the dominant number is $\frac{1}{2^i}$. As we can see, this decreases exponentially. Thus, after 20 or so iterations, if we find that there is a dominant number, then we output that number, otherwise, we can be pretty sure that there isn't a dominant number, i.e., the subarray isn't dominating.

Please see editorialist's/setter's program for implementation details.

# COMPLEXITY:

$\mathcal{O}(N\log^2 N)$

# SAMPLE SOLUTIONS:

This question is marked "community wiki".

asked 23 Jul '16, 19:45 1.3k156581
accept rate: 4% 2.5k53137170

Can anybody tell me how to define structure of segment treee such that i have an array of occurences of variable length in c. Also plz tell me how to define map in c.

(25 Jul '16, 16:38) 3★

 20 it can be solved in $O((N+Q)logN)$ as well Suppose we have an array with N elements , if you pick up any 2 different elements with unequal value and remove them from the array and keep doing this until only 1 distinct value is left , that value is the only possible candidate for answer Now To make this faster , we store this information in segtree , segtree node denotes that if we do this stuff on segment [l , r] , what will be the value left and its frequency. Now we have to check the frequency of an element from l to r , we can either do it online with a bst or do it offline with a simple bit in $nlogn$ time. code answered 25 Jul '16, 00:29 1.0k●1●13 accept rate: 4%
 11 My solution uses square root decomposition. For segments of length < 2000, just loop over and calculate frequency of elements. Report answer. For segments of length > 2000, the dominant element should have frequency > 1000. There will be at most 200 such elements. So I maintain one BIT (binary index tree, or seg tree) for each of those elements, loop over all 200, find the number of elements in range l..r using BIT. Update is 2 fold, update the base array. Also update BIT. Update is O(log N), you update at most one BIT. Query is O(2000) or O(200 * log N). answered 25 Jul '16, 00:32 2.2k●7●26●33 accept rate: 20% 6 Seems, like, you can use Square Root Decomposition on each and every question :P :P (25 Jul '16, 00:35) lohit_974★ Won't there be total 500 such elements with frequency > 1000? (25 Jul '16, 00:36) Initially we have 10^5 elements. Even if you consider the 10^5 query operations as "append" operation instead of replace, at the end you would have 2 * 10^5 elements => at most 200 with >= 1000 frequency (25 Jul '16, 00:42) @anudeep2011, yes, you are right. (25 Jul '16, 00:47) This was my solution too. (25 Jul '16, 16:29) xellos07★
 7 I have also passed with (somehow) "not good" complexity. Worst case [H - means complexity of hash-map]: Update query: O(H^2log(10^9)) Get query: O(sqrt(N)H+sqrt(N)Hlog(10^9)*log(N)) Code (well just to see, it is barely readable :/ ) Thought: I had (for each distinct numbers) a Fenwick-Tree [BIT Tree] in which I held positions. I also held a "priority queue" in which I stored numbers sorted by total frequency. Get query was then to choose between two: A) If the range was lesser than sqrt(N), I brute-forced it. B) If the range was higher than sqrt(N), I looked upon all distinct numbers which had frequency greater than "RANGE/2" and checked (by Fenwick) how many are there on my range. (PS: there could be at most sqrt(N) of these numbers) Anyway I'm wondering how it managed to pass TLE :) Have nice day ^_^ answered 25 Jul '16, 03:11 7★morass 41●1 accept rate: 0%
 4 Even Solution using maps (Complexity O(log^3n) per dominant query) pass within time limit. Here is code. answered 25 Jul '16, 02:24 6★likecs 3.7k●24●81 accept rate: 9%
 2 I feel the explanation for dominant sub-arrays in the editorial is insufficient since it deals with only sub-arrays of size N/2. This is not always the case - especially in case of queries where the query range can span multiple segments(sub-arrays). I think this would be better way to put it, Over a range(i,j) of length N, let there be k sub-arrays {1,2,3..,k} each of length {n1,n2,...nk} respectively. If a majority element 'e' exists in range(i,j) then one of the k sub-arrays must have e as the majority element. Proof is by contradiction (along the same lines as in the editorial), n1 + n2 + ... + nk = N Assume e exists which means freq(e) in (i,j) > N/2 Now lets say none of the k sub-arrays have e as the majority element. Then, freq(e) in sub-array {1] <= n1/2 freq(e) in sub-array {2} <= n2/2 ... freq(e) in sub-array{k} <= nk/2 Therefore, freq(e) in (i,j) <= n1/2 + n2/2 + .. + nk/2 <= N/2 which contradicts our assumption that e is a majority element. So at least one of the k sub-arrays must have e as the majority element. Thanks! answered 25 Jul '16, 14:30 5★noob333 104●1●4 accept rate: 0%
 2 Can someone please explain the meaning of the phrase 'elegant lazy propagation' in the line 'Clearly, There isn't a notion of an elegant lazy propagation here since the updates we perform are to single elements' answered 25 Jul '16, 22:14 103●6 accept rate: 5%
 1 Editorialist's solution is too helpful! answered 17 Feb '17, 08:41 35●2 accept rate: 0%
 0 Was (O(log n)^3) supposed to pass with normal maps? I tried this by maintaing a sorted subarray in every subsegment and then counting kth element in given subsegment which runs in ((logn^2 * log(10^9)) per query, but it gave tle answered 25 Jul '16, 00:11 6★vsp4 1.2k●1●3●8 accept rate: 28% No, it was not supposed to pass. (25 Jul '16, 00:24) 2 @vsp4, I think your solution gives tle because of the ordered statistic set used in your solution. Though, the complexity of ordered statistic set is as desired, but constants are too large. It happened with me too when orderedd statistic set gave tle, but bit with binary search passed easily. (25 Jul '16, 02:27) likecs6★ @vsp4 Can you tell me what is the idea of your solution? (25 Jul '16, 05:09) @rishivikram Find the middle element of the sorted segment [l, r]. If there is a dominator this middle element has to be that dominator. Finding kth element could be done in (logn)^3 time in subarray with updates. Then count how many times this element appears in the array [l,r] This could be done in (logn)^2 (25 Jul '16, 12:26) vsp46★ @vsp4 Kth element in sorted segment can be found in O((logN)^2) by merge sort tree. I don't know how to handle updates though, how do you find it in O((logN)^3) when updates are there as well? (25 Jul '16, 16:12) @rishivikram For updates, erase the old element from each of the logn segments where the given index occurs and add the new element. It takes (logn)^2 time with a bst at each node of segment tree. I also don't know how to do updates with merge sort tree online. However i think a offline approach is possible by creating the tree with all possible values initially and erasing/inserting positions at old/new values respectively. (25 Jul '16, 16:41) vsp46★ I tried with approach similar to merge sort tree: https://www.codechef.com/viewsolution/10904469 But again got TLE with both operations on (log (n + q))^2. The order statistics bst seems to have very high constant!!! It passes only if i distinct b/w heavy and light elements: https://www.codechef.com/viewsolution/10904880 (25 Jul '16, 19:19) vsp46★ showing 5 of 7 show all
 0 Could anyone tell me where I could be going wrong in this solution? It is very similar to the editorial and though the running time is O(Q * (log N)^3), I am getting a WA verdict and not TLE. So could anyone tell me if there's something wrong with the logic? answered 25 Jul '16, 21:27 5★chinmay2 64●3 accept rate: 12%
 0 I have doubt in complexity for each query.Anyone please explain how complexity is O(logn^3) by map? answered 27 Jul '16, 11:24 1 accept rate: 0%
 0 getting WA for this link text code. Any help? answered 27 Jul '16, 19:34 21 accept rate: 0%
 0 How can a segment be dominating if either of subarrays are dominating?? Consider a case [1,2,3,4][2,2,2,5],here 2nd subarray is dominating but when we merge them , the resulting array is not dominating!! answered 18 Aug '16, 21:06 3★akshayv3 157●8 accept rate: 4%
 toggle preview community wiki:
Preview

### Follow this question

By Email:

Once you sign in you will be able to subscribe for any updates here

Markdown Basics

• *italic* or _italic_
• **bold** or __bold__
• image?![alt text](/path/img.jpg "title")
• numbered list: 1. Foo 2. Bar
• to add a line break simply add two spaces to where you would like the new line to be.
• basic HTML tags are also supported
• mathemetical formulas in Latex between \$ symbol

Question tags:

×15,852
×2,655
×1,767
×59

question asked: 23 Jul '16, 19:45

question was seen: 7,000 times

last updated: 17 Feb '17, 08:41