Given a weighted, undirected graph G, (N nodes, M edges) and a string S of length X. Each node in G has a letter attached to it. Find the minimum cost of a walk in G such that the path contains S as a subsequence.
- 1 \leq N,X \leq 10^3
- 0 \leq M \leq 10^3
- No two consecutive characters of string S are same.
SUPER QUICK EXPLANATION:
Run a multi-source dijkstra from every node i, such that letter=s. Use DP[x] = minimum cost to reach node i after covering the first x characters of the string. The answer is min(DP[|S|-1]). Impossible cases occur when the graph is not well-connected and character is not present at all.
We can start our walk from any vertex. So, obviously we’ll start from a vertex i such that letter=s. Let’s call each of these vertices i a source. As we can start from any of these sources, the cost to reach them is 0. When we start from any source, we have already covered the first letter of the string at cost 0.
We eventually want to end our walk at a vertex i such that letter=s[|s|-1]. The cost for this walk has to be minimum possible, while having visited all previous characters of the string as a subsequence. We could have computed the same for all vertices s[|s|-2] and then extended our walk to reach a vertex s[|s|-1], such that the cost of the walk to reach that vertex s[|s|-1]] is minimum possible. Thus, all vertices which have the character s[|s|-2] become a new source after having covered all characters till s[|s|-2] in their walks. Same can be done for s[|s|-3] and so on.
Let’s define cost[x] as the minimum cost (as a walk) to reach vertex i after having covered the first x characters of the string as a subsequence. The final answer we are looking for will be minimum of all cost[x] such that letter=s[|s|-1]. Impossible cases shall occur when either the graph doesn’t contain a particular character present in the string, or the graph isn’t well connected enough to reach all characters of the string in a walk. If we find that we can reach no vertex i for a state s[x] after having covered all the previous characters, then it’s an impossible case.
Thus, we get some kind of greedy algorithm, where for every state x, we have to make locally-optimal choices to get the optimal solution for that state. We can do this by running a multi-source Dijsktra’s algorithm, where, initially, the sources are all vertices i such that letter=s. We run Dijkstra on them for the state x=1, cost=0, signifying that it took us cost 0 to reach them, while having covered the first character of the string. All other nodes have cost = INF initially for all states. While exploring a node i, for each of its neighbour j, we update the cost for j for the state x if it costs us less to reach j for that state from i. If j has the (x+1)th character of the string, we update its cost for state (x+1) instead, as on walking through that node, we are also covering the (x+1)th character of the string. If either of this reduces the cost to reach j for the respective states, we add j for that state to the Dijkstra queue. Thus, j for that state becomes a new source to run Dijkstra from. You can read about Dijsktra’s Algorithm here and here. I found them helpful.
We have X destinations in worst case. For each of them Multi-source dijkstra takes O(MlogN). The cost array takes O(N*X). Thus,
- Time Complexity: X*(M+N)logN
- Space Complexity: O(N*X)
1) Multi-source Dijkstra and a little on the implementation:
Click to view
In classical Dijkstra, we have a single source. We need to find the cost to reach every other node from this source. We assign the distance to reach the source 0 and put it in a priority queue. As we keep exploring the neighbours, we apply distance relaxation on them and if we find a shorter distance to reach them, we update it and add it again to the priority queue with the updated distance. When popping from the queue, if this was the older version of an-already popped node, we ignore this, otherwise we explore the neighbours of this node.
In Multi-source Dijkstra, instead of having a single source, we have multiple sources, that is, we can start our walk from any source. We are only concerned about reaching the destination with minimum cost, irrespective of which source node we started from. The cost to start from each of these sources is 0. Thus, we assign distance 0 to all these sources and put them in the priority queue to explore their neighbours. After all these sources are explored, their neighbours will be treated as the new sources and the neighbour with the minimum cost will be popped out first. This is different from All-pairs shortest path algorithm because here we have only a single destination to reach with minimum cost from multiple possible starting positions, as compared to reaching every possible destination from every node. In all-pairs, we run a disjoint dijkstra from every node, while here, we run 1 dijkstra with the source vertices having distance 0. We could have solved this problem using all-pairs dijkstra with dynamic programming as has been done here, but we run into some TLE issues. The theoretical complexity of all-pairs shortest path problem is N*MlogN while that of multi-source dijkstra is MlogN (for a single destination).
Here, we had multiple destinations too, so we implemented this using a structure having three parameters-
- i The node we are currently at,
- x The number of characters of the string from the start that we have covered, and
- d The cost it incurred.
This structure gave the minimum cost to reach node i after covering the first x letters in the string. Initially, we inserted all nodes which have letter=s* into the priority queue, setting their d to 0 and x to 1, signifying that we have covered 1 character of the string. The queue is prioritised according to d. We run multi-source dijkstra for each destination. As soon as we have covered x characters in the string, we exit from the queue as this must have been the shortest distance to cover x characters.
Feel free to share your approach if it differs. If you have any doubts, or if you feel stuck at any point, you can ask them below. We would love to hear your suggestions