My solution is a bit different than the one described in editorial and has a time complexity
O ((W + P) 2^k) for the third kind of queries.
In order to check whether a sum S is possible, we compute the number of ways T(S) of obtaining that sum. The sum S is possible iff T(S) \ne 0.
Let us say we have coins of denomination x1, x2, \ldots, xk.
and queries are of the form (S, r1, r2, \ldots, rk), i.e., is it possible to make a sum S, using at most r1 coins of the first type, r2 coins of the second type, and so on (S \le W).
For this set of denomination, we pre-compute an array P[1 \ldots W], such that P[x] represents the number of ways of obtaining the sum x, with no restriction on the number of coins. For the time being let us assume that we already have this array, we will explain later how to compute this array.
Now the number of ways of obtaining the sum S with the restrictions on the number of coins can be computed using inclusion-exclusion principle.
T(S, r1, r2, \ldots, rk) = P[S] - Q(S, r1, r2, \ldots, rk),
where Q(S, r1, r2, \ldots, rk) is the number of ways in which at least one of coins is used more than specified bound.
Q(S, r1, r2, …, rk)
= \sum (-1)^{u + 1} \text{number of ways of obtaining } S \text{ such that coins } i1, i2, \ldots, iu \text{ violate the bound.}
= \sum (-1)^{u + 1} P[S - (r_{i1} + 1) x_{i1} - (r_{i2} + 1) x_{i2} - \ldots - (r_{iu} + 1) x_{iu}]
Hence, T(S, r1, r2, \ldots, rk) can be computed in O(2^k) time using the P array.
Now let us see, how to compute the array P.
If there were a single coin denomination x, then we have computed this array quite easily.
P[i] = 1, if i is divisible by x, and 0 otherwise,
Suppose we already have such an array Q for the coins \{x2, x3, \ldots, xk\}, and we want to use it to compute array P for the coin set \{x1, x2, \ldots, xk\}
P[i] = Q[i] + Q[i - x1] + Q[i - 2x1] + \ldots
Note that,
P[i] = Q[i] + P[i - x1]
Hence, array P can be computed in O (W) time using the array Q.
We have k denominations, and we would want to compute the pre-computation array for each subset of denominations, hence the pre-computation can be done in O (W2^k) time.
Alternatively, we can always use the denomination set which consists of all k types, and add the bounds (r_i = 0), for denominations which are missing in the set. This means we only need to pre-compute the array for the coin set consisting of all denominations, and can be done in O (Wk)
Also note that, we need to do all the computation modulo some prime (I chose 10^4 + 7, mainly to reduce the number of modulo operations)
Of course, this means that the constraint P \le 1000, can be relaxed, and we can probably allow up to 10^5 queries. On the other hand, this approach has exponential complexity in terms of k, so even if we increase the value of k slightly (say to 15), this approach will fail.