POINTCN - Editorial

challenge
editorial
hill-climbing
mst
oct17
pointcn

#1

PROBLEM LINK

Practice

Contest

Author: Hasan Jaddouh

Tester: Alexey Zayakin

Editorialist: Jakub Safin

DIFFICULTY

CHALLENGE

PREREQUISITIES

none

PROBLEM

You’re given two N imes N matrices C and D. Construct a connected undirected graph on N vertices while minimising its total cost. The cost of a graph with M edges (u_1,v_1),\dots,(u_M,v_M) and degrees of vertices d_1,\dots,d_N is

\sum_{i=1}^M C_{u_i,v_i} + \sum_{i=1}^N D_{i,d_i}\,.

Note that the vertex costs aren’t necessarily increasing with increasing degrees. The test data are generated randomly.

QUICK EXPLANATION

What, you want a quick explanation for an optimisation problem?

Find a decent solution, then try adding/removing edges and hillclimbing. A decent solution is e.g. the minimum spanning tree or a graph with optimal degrees.

EXPLANATION

First, let’s list some statistical results.

  • the expected sum of N random integers, each in the range [0,M], is \mu=NM/2

  • Central Limit Theorem tells us that for large enough N, the standard deviation of this sum is \sigma=\frac{\sqrt{N}M}{\sqrt{12}}=\frac{\mu}{\sqrt{3N}}

  • the expected value of a minimum of N random numbers is M/(N+1); its standard deviation is \sigma=\mu\sqrt\frac{N}{N+2}

One thing we can do easily is take a random solution – with M edges and N vertices, the cost is ~N\frac{H_{max}}{2}+M\frac{C_{max}}{2} – and then try improving it by adding/removing edges (while keeping connectivity) and recalculating the cost; everytime if we find a better solution, we can keep improving that. Each improvement attempt runs in O(N+M), but the limiting factor is just checking if it’s still connected, so it can be done in O(1) per failed attempt e.g. if the graph is a spanning tree (M=N-1).

It’s usually a good idea to take less edges instead of many. We can also try improving a different solution than a random one.

Minimum spanning tree

Note that if the second sum didn’t exist, we’ve got a problem that’s well-known and easy to solve: computing the MST of a complete graph. This can be done in O(N^2) using a modification of Jarnik-Prim algorithm which uses a list instead of a heap. Turns out the expected cost of this tree is ~1.2C_{max}.

Of course, even if we find an MST, the second sum will give us N random costs, so the expected costs of this approach is ~N\frac{H_{max}}{2}+1.2C_{max}.

Optimal degrees

How about optimising the second sum? We can just pick the minimum H_{ij} for each i, but that doesn’t guarantee us if there even is a connected graph with these degrees. Regardless, the expected minimum is ~H_{max}/N, so their expected sum is ~H_{max} (instead of H_{max}N/2 for random choices).

The required conditions for a connected graph with degrees d_{1..N} are:

  • the sum of degrees must be even (the so-called handshake lemma)

  • \sum d_i \ge 2(N-1)

  • no vertex has degree 0

This works because we can just keep making edges between 2 vertices with the currently maximal d_i and decrementing those d_i, which gives some graph with \ge N-1 vertices; then we can keep removing a non-bridge edge from some component and using it to connect 2 components until the graph is connected.

It might turn out that the degree sequence which gives the minimum cost from H isn’t feasible, but non-zero degrees are trivial, the expected \sum d_i is ~N^2/2 (the indices j which give minima are random) so that condition shouldn’t be a problem and if the parity doesn’t fit, it can be fixed by changing just one degree. All in all, the cost of the second sum can be ~H_{max}.

We still need to choose edges to fit the degrees. This is good, because it allows us to minimise the cost of the first sum as far as possible and get something better than ~\frac{C_{max}}{2}\frac{N^2}{4} (there are \sum d_i/2 edges). After we choose the edges, we can still run the hill-climbing / random optimisation.

Something else

We have described two approaches which try to minimise one sum at the cost of keeping the other pretty large. Something in between would be even better – picking degrees so that the cost from H wouldn’t be toolarge and the number of edges wouldn’t be too large either, choosing worse edges to get better degrees… it also depends on the specific values of C_{max} and H_{max}.

What did you use?

AUTHOR’S AND TESTER’S SOLUTIONS

Setter’s solution

Tester’s solution


#2

Setter’s solution is the default template.


#3

vital part of problem is Minimum spanning tree.after that we have a network that all connected to each other … now we can use add or remove(must remain connected!) some cable optimally for get best point…in this part we can use many approach for connect office optimally …