LADCOMP - Editorial


Contest Division 1
Contest Division 2
Contest Division 3

Setter: Yuriy Fedorov
Tester: Aryan
Editorialist: Taranpreet Singh




Euler tour of tree and LCA using RMQ would do.


Given a tree with N nodes numbered from 1 to N rooted at 1, where each node i \geq 2 has it’s parent node P_i < i. This way, first i nodes form a tree rooted at 1 for each i.

Let’s define the depth array h as follows: h_1 = 1, and h_i = h_{P_i} + 1 for all i \ge 2.

The subtree of a vertex u, denoted S(u), is defined as the set of vertices v such that the unique path from 1 to v contains u. Also, we define $$\mathrm{val}_u = \max {h(v) : v\in S(u)}.$$

A tree is divided into paths by a ladder decomposition if each vertex is located on exactly one path, and each vertex u with at least one child lies in the same path as its child v with the maximum \mathrm{val}_v, and if there are several such vertices, then the vertex with the minimum number is selected.

Madoka defines the beauty of a tree in the following way. Let the ladder decomposition of the path lengths be L_1, \dots, L_k, then the beauty of the tree is L_1^2 + L_2^2 + \cdots + L_k^2

For each i (1\le i\le n), your task is to calculate the beauty of the tree formed by the first i vertices.


  • Process the nodes in increasing order, computing the increase in beauty after adding each node.
  • For each node i, the beauty increases by 2*(depth_i-depth_u)-1 if node u is the node with minimum depth such that there’s no node with label \lt i with depth depth_{i}.
  • To find such node u, we can store an ordered of nodes at each depth, ordered by their start time in euler tour, which allows us to compute P_u by considering only adjacentt nodes present in set.


Starting with one node, the tree has beauty zero. So if at each step, we can compute the change in beauty due to the addition of a new node, we would manage to solve the problem.

A different way to compute beauty

Let’s consider the following method to build ladder decomposition.

Repeat following steps until all nodes are not in ladder decomposition

  • Pick the lowest labeled node which is not yet included in any chain in ladder decomposition, say node u.
  • In subtree of node u, pick node v with the maximum depth. If there are multiple nodes with the maximum depth, pick the node with the minimum label.
  • Add (dep_v-dep_u)^2 to the beauty, and include chain of nodes from u to v in ladder decomposition.

We can see that the above approach is the same as the one in the problem statement, just framed differently, yet it shall help us proceed in editorial more intuitively.

A significant point to be noted here is that if multiple choices exist for node v (with maximum depth) in step 2, all choices yield the same beauty even though we chose the one with the minimum label.


This can be proved by the fact that relabeling nodes do not alter the beauty of the tree, so we can make labeling to make any node with the maximum depth be chosen as the deepest node v.


Let us assume we already have repeated the process for tree formed by first i nodes, resulting in beauty B_i, and we wish to calculate beauty B_{i+1} by adding node labeled i+1.

Observation 1: The algorithm above repeats in the same manner for the first i+1 nodes as for the first i nodes until node i+1 is picked as the deepest node in the second step for some nodes u chosen in the first step.

Proof: Node i+1 is a leaf node when considering first i+1 nodes. So it can be included in the chain only when it is the deepest node of that chain. Additionally, node i+1 has the maximum label in first i+1 nodes, so if node u has multiple nodes with maximum depth, including i+1, then the node with smaller label will be chosen, which is not node i+1

Following observation follows from the proof of first observation.

Observation 2: The node i+1 shall be picked as the deepest node for some node u, if and only if subtree of node u doesn’t contain any other node in its subtree with depth equal or greater than node i+1.

Now, let’s assume we are simulating the above algorithm for the first i nodes, and for the first i+1 nodes simultaneously, to observe their differences.

Let’s suppose node u is the first node, for which, the deepest node chosen is node i+1. Also assume for now that u \neq i+1.

Before this step, the two simulations worked exactly the same.

Now, when considering first i nodes, we would have chosen same node u, but the deepest node in it’s subtree shall have depth depth_{i+1}-1, since subtree of u contains node P_{i+1} \leq i having depth depth_{i+1}-1.

Hence, at this step, a chain of length depth_{i+1}-depth_u is formed in case of first i+1 nodes, instead of a chain of length depth_{i+1}-1-depth_u. The chain length has increased by 1.

After this step, the subsequent steps shall proceed in the same manner for both simulations.

Hence, the increase in beauty is (depth_{i+1}-depth_u)^2 - (depth_{i+1}-depth_u-1)^2. Additionally, it may happen that node u = i+1 in some cases. In this case, only a chain of length 0 is created, hence beauty does not change.

Hence, all we need to do is to find the first u node, which is an ancestor of node i+1, such that for node u, node i+1 is the only node with maximum depth, and node P_u contains at least one node with depth depth_{i+1} other than node i+1

Example Explanation

Consider first sample of problem statement.
Processing nodes from 2 to 6 as follows. Also, assume depth_1 = 1
For node 2, node u is node 1, so the increase in beauty is (depth_2-depth_1)^2-(depth_2-depth_1-1)^2 = 1^2-0^2 = 1. Overall beauty 1.
For node 3, node u is node 1, so the increase in beauty is (depth_3-depth_1)^2-(depth_3-depth_1-1)^2 = 2^2-1^2 = 3. Overall beauty 4.
For node 4, node u is node 4, so there’s no increase in beauty. Overall beauty 4.
For node 5, node u is node 4, so the increase in beauty is (depth_5-depth_4)^2-(depth_5-depth_4-1)^2 = 1^2-0^2 = 1. Overall beauty 5.
For node 6, node u is node 1, so the increase in beauty is (depth_6-depth_1)^2-(depth_6-depth_1-1)^2 = 3^2-2^2 = 5. Overall beauty 10.

Suboptimal approach

We know that node u is an ancestor of node i+1.

Also, we know that if we make list of ancestors of node i+1 in the order of depth from low to high, some prefix of nodes till P_u have a node in their subtree with depth depth_{i+1} other than node i+1, and all nodes in the suffix of that list have no node at depth depth_{i+1} except i+1.

Above is the required condition to apply binary lifting. Let’s denote par_{c, x} as the x-th ancestor of node c.

Hence, for node c = i+1, consider node par_{c, 2^b} in decreasing order of b (which is precomputed), check if subtree of par_{c, 2^b} contains at least one node with depth depth_{i+1} among first i nodes. If no, update c = par_{c, 2^b}.

In order to check, let’s do euler tour of tree to compute start time st_i and end time en_i for each node. To check efficiently, maintain a segment tree with point update range max query, and after processing a node i, update position st_i with value depth_i. That way, maximum of range of [st_x, en_x] returns the maximum depth of any node in subtree of node x.

We need to check O(log(N)) ancestors, and each check takes O(log(N)) time, so this solution works in O(N*log^2(N)) time, which we didn’t intend to pass last subtask, but still, with some optimizations, it may pass.

We shall look at a faster way of finding node u.

Optimal approach

Instead of finding u, let’s find P_u, and we can adjust the formula accordingly. If u = 1, we can actually add a ficticious node 0 as parent of node 1, so that P_1 = 0

We know that subtree of P_u contains a node with depth depth_{i+1} other than node i+1.

So there’s a node x in subtree of P_u contains a node at depth depth_{i+1} such that lca(x, i+1) = P_u.

So, let’s consider all nodes x at depth depth_{i+1} one by one and compute y = lca(x, i+1). We can prove that among all y, the node with maximum depth shall be P_u.

This provides us with a method to find P_u, but it is suboptimal, since there may be many nodes at depth depth_{i+1} with label \leq i.

To optimize this, let’s assume we have performed euler tour and computed start and end times.

Let us make a list of nodes with labels \leq i+1 having depth depth_{i+1}, and sort them by their start time. If node a is to immediate left of node i+1, then consider node x = lca(a, i+1). If node b is to the immediate right of node i+1 in this ordering, then consider y = lca(b, i+1).

We can see that P_u is the node with maximum depth among x and y.


If we have nodes in order a', a, i+1 when sorted by start time, then depth of node lca(a', i+1) shall not be higher than depth of node lca(a, i+1), by properties of euler tour of tree.

Hence, just find the nodes a and b at depth i+1, having their start time close to node i+1, compute lca(a, i+1) and lca(i+1, b), and pick the node with maximum depth.


Perform euler tour of tree, computing start and end times, and preprocess to compute lca queries either by binary lifting or RMQ. For each depth, maintain a set of pairs of form (st_x, x), containing nodes ordered by their start time.

Now processing nodes one by one, find nodes a and b using lower_bound and upper_bound queries, update beauty, and add pair (st_i, i) into set corresponding to depth_{i}

Additionally, you may refer to comments by setter here.


The time complexity is O(N*log(N)) per test case.


Setter's Solution
#include <iostream>
#include <vector>
#include <set>

using namespace std;

const int N = 1e6 + 228;
const int K = 20;

int up[N][K];
int tin[N], h[N];
int timer = 0;

vector<int> G[N];

void dfs(int v) {
    tin[v] = timer++;
    for (int i : G[v])

int la(int a, int level) {
    for (int i = K - 1; i >= 0; --i) {
        if (level >= (1 << i))
            level -= (1 << i), a = up[a][i];
    return a;

int lca(int a, int b) {
    if (h[a] > h[b])
        swap(a, b);
    b = la(b, h[b] - h[a]);
    if (a == b)
        return a;
    for (int i = K - 1; i >= 0; --i) {
        if (up[a][i] != up[b][i])
            a = up[a][i], b = up[b][i];
    return up[a][0];

void solve() {
    int n;
    cin >> n;
    timer = 0;
    for (int i = 0; i <= n; ++i)

    up[1][0] = 0, h[1] = 1;
    for (int i = 2; i <= n; ++i) {
        int a;
        cin >> a;
        up[i][0] = a, h[i] = h[a] + 1;

    for (int k = 1; k < K; ++k) {
        for (int i = 0; i <= n; ++i) {
            up[i][k] = up[up[i][k - 1]][k - 1];
    vector<set<pair<int, int>>> who(n + 1);
    for (int i = 0; i <= n; ++i) {
        who[i].emplace(-1, 0);
        who[i].emplace(n + 3, 0);

    cout << 0 << ' ';
    int cnt_leaf = 0;
    long long val = 0;
    vector<int> leaf(n + 1);
    for (int i = 2; i <= n; ++i) {
        auto z = who[h[i]].lower_bound(make_pair(tin[i], i));
        int v = lca(z->second, i), _v = lca((--z)->second, i);
        if (h[_v] > h[v])
            swap(_v, v);

        val += 2 * (h[i] - h[v] - 1);
        if (leaf[up[i][0]]) {
            leaf[up[i][0]] = 0;
        leaf[i] = 1;
        who[h[i]].emplace(tin[i], i);
        cout << val - i + cnt_leaf << ' ';
    cout << '\n';

int main() {
    int t;
    cin >> t;
    while (t--)
Tester's Solution
/* in the name of Anton */

  Compete against Yourself.
  Author - Aryan (@aryanc403)
  Atcoder library -

#ifdef ARYANC403
    #include <header.h>
    #pragma GCC optimize ("Ofast")
    #pragma GCC target ("sse,sse2,sse3,ssse3,sse4,popcnt,abm,mmx,avx")
    //#pragma GCC optimize ("-ffloat-store")
    #define dbg(args...) 42;

using namespace std;
#define fo(i,n)   for(i=0;i<(n);++i)
#define repA(i,j,n)   for(i=(j);i<=(n);++i)
#define repD(i,j,n)   for(i=(j);i>=(n);--i)
#define all(x) begin(x), end(x)
#define sz(x) ((lli)(x).size())
#define pb push_back
#define mp make_pair
#define X first
#define Y second
#define endl "\n"

typedef long long int lli;
typedef long double mytype;
typedef pair<lli,lli> ii;
typedef vector<ii> vii;
typedef vector<lli> vi;

const auto start_time = std::chrono::high_resolution_clock::now();
void aryanc403()
#ifdef ARYANC403
auto end_time = std::chrono::high_resolution_clock::now();
std::chrono::duration<double> diff = end_time-start_time;
    cerr<<"Time Taken : "<<diff.count()<<"\n";

template<class Fun> class y_combinator_result {
    Fun fun_;
    template<class T> explicit y_combinator_result(T &&fun): fun_(std::forward<T>(fun)) {}
    template<class ...Args> decltype(auto) operator()(Args &&...args) { return fun_(std::ref(*this), std::forward<Args>(args)...); }
template<class Fun> decltype(auto) y_combinator(Fun &&fun) { return y_combinator_result<std::decay_t<Fun>>(std::forward<Fun>(fun)); }

long long readInt(long long l, long long r, char endd) {
    long long x=0;
    int cnt=0;
    int fi=-1;
    bool is_neg=false;
    while(true) {
        char g=getchar();
        if(g=='-') {
        if('0'<=g&&g<='9') {
            if(cnt==0) {
            assert(fi!=0 || cnt==1);
            assert(fi!=0 || is_neg==false);

            assert(!(cnt>19 || ( cnt==19 && fi>1) ));
        } else if(g==endd) {
            if(is_neg) {
            return x;
        } else {
string readString(int l, int r, char endd) {
    string ret="";
    int cnt=0;
    while(true) {
        char g=getchar();
        if(g==endd) {
    return ret;
long long readIntSp(long long l, long long r) {
    return readInt(l,r,' ');
long long readIntLn(long long l, long long r) {
    return readInt(l,r,'\n');
string readStringLn(int l, int r) {
    return readString(l,r,'\n');
string readStringSp(int l, int r) {
    return readString(l,r,' ');

void readEOF(){

vi readVectorInt(int n,lli l,lli r){
        return {};
    vi a(n);
    for(int i=0;i<n-1;++i)
    return a;


lli seed;
mt19937 rng(seed=chrono::steady_clock::now().time_since_epoch().count());
inline lli rnd(lli l=0,lli r=INF)
{return uniform_int_distribution<lli>(l,r)(rng);}

class CMP
bool operator()(ii a , ii b) //For min priority_queue .
{    return ! ( a.X < b.X || ( a.X==b.X && a.Y <= b.Y ));   }};

void add( map<lli,lli> &m, lli x,lli cnt=1)
    auto jt=m.find(x);
    if(jt==m.end())         m.insert({x,cnt});
    else                    jt->Y+=cnt;

void del( map<lli,lli> &m, lli x,lli cnt=1)
    auto jt=m.find(x);
    if(jt->Y<=cnt)            m.erase(jt);
    else                      jt->Y-=cnt;

bool cmp(const ii &a,const ii &b)
    return a.X<b.X||(a.X==b.X&&a.Y<b.Y);

const lli mod = 1000000007L;
// const lli maxN = 1000000007L;

void solve(const vi &pars){
    const lli n=sz(pars)+1;
    vector<vi> e(n);
    cout<<0<<" ";
    vi head(n,-1),size(n,0),child(n,-1);//chain info.
    vi height(n);
    lli ans=0;

    auto get=[&](lli x){
            return 0LL;
        return (x-1)*(x-1);

    auto changeHead = [&](const lli hd,lli u){

    function<void(lli)> recalibrate = [&](lli cur){
        const lli par=pars[cur-1],oth=child[par];
        const lli headPar = head[par];
        const lli curLen = height[cur] - height[headPar] + size[cur], pvrLen = size[headPar];

        // head[oth]=oth;



    for(lli i=1;i<n;++i){
        const lli par=pars[i-1];
        lli hd=i;

        cout<<ans<<" ";

int main(void) {
    // freopen("", "r", stdin);
    // freopen("txt.out", "w", stdout);
// cout<<std::fixed<<std::setprecision(35);
lli T=readIntLn(1,10);
lli sumN = 0, hmax = 0;

    const lli n=readIntLn(1,8e5);
    auto pars=readVectorInt(n-1,1,n);
    vi height(n,-1);
    for(lli i=1;i<n;++i){
        const lli par=pars[i-1];
    // assert(sumN<=2e3); //Subtask - 10 pts.
    // assert(sumN<=1e5); assert(hmax<=3); //Subtask - 20 pts.
    // assert(sumN<=2e5); //Subtask - 30 pts.
    assert(sumN<=8e5); //Full subtask.

    return 0;
Editorialist's Solution
import java.util.*;
class LADCOMP{
    void pre() throws Exception{}
    void solve(int TC) throws Exception{
        int N = ni();
        int[] P = new int[1+N], deg = new int[1+N];
        P[0] = -1;
        for(int i = 2; i<= N; i++)P[i] = ni();
        for(int i = 1; i<= N; i++)deg[P[i]]++;
        int[][] g = new int[1+N][];
        for(int i = 0; i<= N; i++){
            g[i] = new int[deg[i]];
            deg[i] = 0;
        for(int i = 1; i<= N; i++)g[P[i]][deg[P[i]]++] = i;
        LCA lca = new LCA(g);
        TreeSet<int[]>[] height = new TreeSet[1+N];
        for(int i = 0; i<= N; i++)height[i] = new TreeSet<>((int[] i1, int[] i2) ->[0], i2[0]));
        int[] depth = new int[1+N];
        long[] ans = new long[1+N];
        height[0].add(new int[]{[0], 0});
        long sum = 0;
        for(int i = 1; i<= N; i++){
            depth[i] = depth[P[i]]+1;
            int target = 0;//target should be the deepest ancestor of node i, such that node target has a node in its subtree with depth same as i
            int[] tuple = new int[]{[i], i};
            int[] prev = height[depth[i]].floor(tuple), next = height[depth[i]].ceiling(tuple);
            if(prev != null){
                int u = lca.lca(prev[1], i);
                if(depth[target] < depth[u])target = u;
            if(next != null){
                int v = lca.lca(i, next[1]);
                if(depth[target] < depth[v])target = v;
            long d = depth[i]-depth[target]-1;//length of chain formed
            if(target != P[i])
                sum += d*d-(d-1)*(d-1);
            ans[i] = sum;
        for(int i = 1; i<= N; i++)p(ans[i]+" ");
    class LCA{
        int n = 0, ti= -1;
        int[] eu, st, d;
        RMQ rmq;
        public LCA(int[][] g){
            n = g.length;
            eu = new int[2*n-1];st = new int[n];d = new int[n];
            Arrays.fill(st, -1);Arrays.fill(eu, -1);
            dfs(g, 0, -1);
            rmq = new RMQ(eu, d);
        public LCA(int[] eu, int[] fi, int[] d){
            this.n = eu.length;
   = eu;
   = fi;
            this.d = d;
            rmq = new RMQ(eu, d);
        void dfs(int[][] g, int u, int p){
            eu[++ti] = u;st[u] = ti;
            for(int v:g[u])if(v!=p){
                d[v] = d[u]+1;
                dfs(g, v, u);eu[++ti] = u;
        int lca(int u, int v){return rmq.query(Math.min(st[u], st[v]), Math.max(st[u], st[v]));}
        int dist(int u, int v){return d[u]+d[v]-2*d[lca(u,v)];}
        class RMQ{
            int[] len, d;
            int[][] rmq;
            public RMQ(int[] ar, int[] weight){
                len = new int[ar.length+1];
                this.d = weight;
                for(int i = 2; i<= ar.length; i++)len[i] = len[i>>1]+1;
                rmq = new int[len[ar.length]+1][ar.length];
                for(int i = 0; i< rmq.length; i++)
                    for(int j = 0; j< rmq[i].length; j++)
                        rmq[i][j] = -1;
                for(int i = 0; i< ar.length; i++)rmq[0][i] = ar[i];
                for(int b = 1; b<= len[ar.length]; b++)
                    for(int i = 0; i + (1<<b)-1< ar.length; i++)
                        if(weight[rmq[b-1][i]]<weight[rmq[b-1][i+(1<<(b-1))]])rmq[b][i] =rmq[b-1][i];
                        else rmq[b][i] = rmq[b-1][i+(1<<(b-1))];
            int query(int l, int r){
                if(l==r)return rmq[0][l];
                int b = len[r-l];
                if(d[rmq[b][l]]<d[rmq[b][r-(1<<b)]])return rmq[b][l];
                return rmq[b][r-(1<<b)];
    void hold(boolean b)throws Exception{if(!b)throw new Exception("Hold right there, Sparky!");}
    static boolean multipleTC = true;
    FastReader in;PrintWriter out;
    void run() throws Exception{
        in = new FastReader();
        out = new PrintWriter(System.out);
        //Solution Credits: Taranpreet Singh
        int T = (multipleTC)?ni():1;
        pre();for(int t = 1; t<= T; t++)solve(t);
    public static void main(String[] args) throws Exception{
//        new LADCOMP().run();
        new Thread(null, new Runnable() {public void run(){try{new LADCOMP().run();}catch(Exception e){e.printStackTrace();System.exit(1);}}}, "1", 1 << 26).start();
    int bit(long n){return (n==0)?0:(1+bit(n&(n-1)));}
    void p(Object o){out.print(o);}
    void pn(Object o){out.println(o);}
    void pni(Object o){out.println(o);out.flush();}
    String n()throws Exception{return;}
    String nln()throws Exception{return in.nextLine();}
    int ni()throws Exception{return Integer.parseInt(;}
    long nl()throws Exception{return Long.parseLong(;}
    double nd()throws Exception{return Double.parseDouble(;}

    class FastReader{
        BufferedReader br;
        StringTokenizer st;
        public FastReader(){
            br = new BufferedReader(new InputStreamReader(;

        public FastReader(String s) throws Exception{
            br = new BufferedReader(new FileReader(s));

        String next() throws Exception{
            while (st == null || !st.hasMoreElements()){
                    st = new StringTokenizer(br.readLine());
                }catch (IOException  e){
                    throw new Exception(e.toString());
            return st.nextToken();

        String nextLine() throws Exception{
            String str = "";
                str = br.readLine();
            }catch (IOException e){
                throw new Exception(e.toString());
            return str;

Feel free to share your approach. Suggestions are welcomed as always. :slight_smile:

1 Like