need help to improve the time complexity

this ques ( from IIIT Hyderabad Contest - code lags in the second part of the ques where we need to traverse a sub tree rooted on a given node, for this i perform a bfs, (at the time of input, i use the container vector< pair< llu, set > > with first part storing the distance from root and second part containing the children of that node)
for first part i just maintain a 1D vector which stores and updates coins at each level.
for second part i perform bfs using the the container, but the problem is in worst case i.e. with all nodes being immediate children of the root, complexity of bfs goes up to O(M*N), M = no. of queries = 10^4, N = no. of nodes = 10^5.
hence my code gives TLE at test 2.
so my question is : to analyse all the nodes rooted at a node x we are bound to visit all of them, which implies we cant improve N, and M is a constraint, so can my algo be made to pass TLE with small improvements, or do i have to srap the idea and start afresh?

my code :