 # TOURBUS - Editorial

Author: David Stolp
Tester: Gerald Agapov
Editorialist: Jingbo Shang

Challenge

### PREREQUISITES:

Greedy, Dynamic Programming, Search, Random

### PROBLEM:

Given a undirected graph with N nodes and M edges, while each node is associated with a 2D coordinate, find minimum number of simple paths or cycles, which are not self-intersected and covered each edge exactly once.

### EXPLANATION:

This is a challenge problem. The most important thing in solving challenge problem is to find some approximation/partial methods using the combination of search, greedy, dynamic programming, etc…, because this challenge problem is NP-hard. Also, it is important to write a local test system to help you find the best strategy (almost all top ranked guys wrote their own test codes).

To solve this problem, first of all, we need to know why there is a restriction of the length of answer – should be smaller than or equal to (N+M)/2. That is because we can connect two edges share a common vertex as a path, which is definitely not self-intersected. Therefore, we can achieve the restriction of output.

And then, considering the special cases with smaller size. For example, if M is smaller than 15, we can use dynamic programming to get the perfect answer. First of all, generate all valid paths and cycles. They are distinguished by its set of edges (represented in a binary mask, 0…2^M). And then, for a given subset S of edges, we enumerate all its subset check whether it is a possible path/cycle and update the optimal answer accordingly. More specifically, denote the minimum valid paths/cycles needed to exactly cover the edge set S as f[S]. The DP procedure is O(3^M) after we obtain all valid paths/cycles as following:

``````f = 0; // it is an empty set
for S = 1 to 2^M - 1 {
f[S] = infinite
for (sub = S; sub > 0; sub = S & (sub - 1)) {
if (sub is a valid path/cycle) {
f[S] = min{ f[sub] + 1, f[S] }
}
}
}
``````

The trick is to enumerate the sub via binary operations instead of 0…2^M.

For the general cases, i.e. large N and M, we can find some greed ways. For example, intuitively, we may need to find out the longest valid path/cycle, remove them, and go on. Most of the top solutions during contest apply this idea or similar ideas. But, even finding the longest valid path/cycle is NP-hard in undirected graph. Here, we introduce two ways to find the longest path. The framework is as following:

``````while there is any edge remained in the graph {
longest = the longest valid path/cycle in the graph
remove longest from the graph
}
``````

For both methods, we need to find some heuristic functions to rank the next edge to add to the current path. The most intuitive function is based on the intersections of other edges. It reflects how an edge conflicts with others, which means those edges will be invalid after we select this edge. Therefore, we can rank the edges in order. For the ties of edges, we can break them randomly.

The first method is Search. It takes a starting edge, and use dfs to expand the path as long as possible. The second method is Greedy, choose the best edges in each step. For Search, we also need to add constrains to the number of expanded solutions or the time it costs. Search may give better solution while Greedy is much faster. This is a trade-off.

In the end, I would like to give you 3 more tips:

1. Due to the randomness of the order of edges, it will be better to run the algorithm multiple times to get a better score.
2. Use different heuristic functions for different type of data or for different runs of the algorithm, may also benefit to your score.
3. Implement your codes as fast as possible, such that it could be run more times in the time limit.

Any more awesome ideas are welcome to share in the replies. ### AUTHOR’S AND TESTER’S SOLUTIONS:

Author’s solution can be found here.
Tester’s solution can be found here.

1 Like

The Editorial gives perfect solution for tests with M<=15. Do you really belive such tests are present in official test cases? I mean N>20 and P>0,3. So the expected number of edges is at least about 60.

1 Like

Would anyone be kind enough to share their test case generator / verifier now that the contest is over ? My code was returning WA with some parameters and I wanted to investigate them further.

It’d save me some time having to write my own … sigh.

Can any of the top scorers or others explain how to use some of the other methods like using intersection between the segments or dfs method etc…

P.S.:

I have done it using the following method:

1. find degree of each vertex.
2. Sort the vertices in decreasing order of degrees.
3. choose the first vertex.
4. If it has degree >=2, then make a 2 length path with this vertex as center and then extend it both ways(choosing the first positive degree vertex to add to the end and the repeat the same with the newly added end).
5. If it has degree 1, then make a 1 length tour
6. decreasing the corresponding degrees and also deleting the corresponding edges from the graph at each step of adding an edge.
7. Go to step 2 and repeat the same until there are more edges.

This method gave me a score of 0.447
(suggestions to the method i used are also welcomed)

i hv used dfs aproach
but i dont know y its giving wa
pls can any tell me

1 first select vertex with min degree
2 from it start dfs till any vertex is left (excluding intersecting n repeated ones) n deleting these edges frm grph
3 repaeting 1 n 2 till min degree=0

My solution mostly corresponds to what explained in the second part of the editorial.
But instead of greedy or recursive search I use beam search with width equal to 1000 / N (dogscience). For cost of the edge I used euclidean and Manhattan distances - the greater the distance the better the edge. Don’t ask me why - this is dog science as well As mentioned in editorial I run the tours search multiple times while time permits. To speed up the process I stop the particular run if (the number of paths in it) + (some basic estimate of the number of paths in the remaining graph) is >= (the current best solution). For basic estimate I take (max_degree + 1) / 2. The better one would be the sum of such values over all connected components. But it’s time-expensive and give worse score.

Another small difference is that instead of taking some fixed edge I take some vertex and build the longest path starting from it. Also once the path is not extendable I try to extend it in opposite direction. My tours search has two modes. Random one - where at each step we take any vertex of positive degree and greedy one - where at each step we take the vertex of maximum degree.

Since number of vertexes is below 64 I use bit masks technique to store the graph.

See details in my solution. I cleaned up all testing stuff for readability. It scores 0.741 points.

Regarding the testing stuff I want to mention that test generator here could take long time to generate tests with N close to 50, due to collinearity condition. So instead of generating them in each run I generated random 50 tests once and save them into file.

2 Likes

This is a very basic generator. It does not check for collinearity though it is unprobable. Also value of n has to be changed manually.

1 Like

This perfect algorithm can be applied when some longest paths/cycles have been removed from the original graph.

1 Like

I think you can look at some top solutions, they all have generator and verifier that you need. Also, it is not hard to implement it. I think implementing them now will be helpful for the next month’s contest. 1 Like

I used a similar approach to yours but with these diff.

1. Instead of cities, I used roads (represented a road between i, j as i*N + j where i < j) and determined connecting roads.

2. When choosing a road to extend both ways for a tour, i try it with ‘kLimit’ topmost nodes (like yours, in the order of their max degrees).

3. After extending, I get kLimit different tours at this level - I try the 2 longest and attempt the rest of the solution with them (all the rest tours)

4. The best of these two is chosen - however to save time, this check for 2 is only done with top ‘kThreshold’ level

I tweaked values for kLimit and kThreshold to get more optimal results … but unfortunately I didn’t do much good either .607.

Thank you, i’ll look into it.

A random test case generator shouldn’t be hard, but one which ensures points are collinear, etc. And the verifier which ensures all the roads have been visited, and no road intersects would take a bit of an effort. I’m prepared to write one myself, but would’ve saved me a lot of effort if someone who already had done it for the contest, shared it with me.

whats this “kThreshold” ?

Basically after you get the first tour (the set of connecting roads), you’ll have to run the same algorithm again on the remaining roads till you get tours with all of them covered.

I am only evaluating the best of 2 tours for first ‘kThreshold’ tours. For the remaining tours, the best solution is automatically selected.

My reasoning was, that at that point, you’ll have more discontinous sets …

We can check these conditions and re-generate if some are violated.

Example psuedocode

```FindTours (tour_no) :

repeat for kLimit times
expand it to both sides
save the tour
end

if (tour_no > kThreshold)
select best tour
FindTour(tour_no + 1)
return solution
end

repeat for 2 best tours
select tour
FindTour(tour_no + 1)
save finalSolution
end

return with best out of two finalSolution
end
```

You can use my generator: http://ideone.com/89oalv

1 Like

a good variant of my approach. the advantage you have with this method is there are more variables to play with.

i hv used dfs aproach
but i dont know y its giving wa
pls can any tell me

1 first select vertex with min degree
2 from it start dfs till any vertex is left (excluding intersecting n repeated ones) n deleting these edges frm grph
3 repaeting 1 n 2 till min degree=0