I recently started reading *The Art of Computer Programming*, and everything was going smoothly until I reached the end of the first section, which describes how to formulate algorithms under the additional restriction that they be effective. Below is an excerpt from the book:

If we wish to restrict the notion of algorithm so that only elementary operations are involved, we can place restrictions on $Q$, $I$, $\Omega$, and $f$, for example as follows: Let $A$ be a finite set of letters, and let $A^*$ be the set of all strings on $A$ (the set of all ordered sequences $x_1 x_2 \ldots x_n$, where $n \ge 0$ and $x_j$ is in $A$ for $1 \le j \le n$). The idea is to encode the states of the computation so that they are represented by strings of $A^*$. Now let $N$ be a non-negative integer and let $Q$ be the set of all $(\sigma, j)$, where $\sigma$ is in $A^*$ and $j$ is an integer, $0 \le j \le N$; let $I$ be the subset of $Q$ with $j = 0$ and let $\Omega$ be the subset with $j = N$. If $\theta$ and $\sigma$ are strings in $A^*$, we say that $\theta$ occurs in $\sigma$ if $\sigma$ has the form $\alpha \theta \omega$ for strings $\alpha$ and $\omega$. To complete our definition, let $f$ be a function of the following type, defined by the strings $\theta_j$, $\phi_j$ and the integers $a_j$, $b_j$ for $0 \le j < N$:

1) $f((\sigma, j)) = (\sigma, a_j)$, if $\theta_j$ does not occur in $\sigma$;

2) $f((\sigma, j)) = (\alpha \phi_j \omega, b_j)$, if $\alpha$ is the shortest possible string for which $\sigma = \alpha \theta_j \omega$;

3) $f((\sigma, N)) = (\sigma, N)$.

Every step of such a computational method is clearly effective, and experience shows that pattern-matching rules of this kind are also powerful enough to do anything we can do by hand. There are many other essentially equivalent ways to formulate the concept of an effective computational method (for example, using Turing machines). The formulation above is virtually the same as that given by A. A. Markov in his book *The Theory of Algorithms*.
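To check my reading of the three rules, here is a minimal Python sketch of the function $f$ as I understand it. The names `step`, `run`, `RULES`, and the `max_steps` guard are my own and do not come from the book; each entry of `rules` is assumed to be the tuple $(\theta_j, \phi_j, a_j, b_j)$, and strings over $A$ are modelled as ordinary Python strings.

```python
def step(state, rules, N):
    """Apply f once to a state (sigma, j); rules[j] = (theta_j, phi_j, a_j, b_j)."""
    sigma, j = state
    if j == N:
        # Rule 3: terminal states are mapped to themselves.
        return (sigma, N)
    theta, phi, a, b = rules[j]
    pos = sigma.find(theta)  # leftmost occurrence, i.e. the shortest alpha
    if pos == -1:
        # Rule 1: theta_j does not occur in sigma, so only j changes.
        return (sigma, a)
    # Rule 2: sigma = alpha + theta_j + omega becomes alpha + phi_j + omega.
    alpha, omega = sigma[:pos], sigma[pos + len(theta):]
    return (alpha + phi + omega, b)


def run(sigma, rules, N, max_steps=10_000):
    """Start in the input state (sigma, 0) and iterate f until j == N."""
    state = (sigma, 0)
    for _ in range(max_steps):
        if state[1] == N:
            return state[0]
        state = step(state, rules, N)
    raise RuntimeError("no terminal state reached within max_steps")


# Hypothetical example with N = 1 and a single rule: rewrite the leftmost
# "ab" as "ba" and stay at j = 0; once no "ab" is left, jump to j = 1.
RULES = [("ab", "ba", 1, 0)]
print(run("abab", RULES, 1))
```

On the input `"abab"` the states go through `abab`, `baab`, `baba`, `bbaa` and then stop, so every `b` ends up before every `a`. Each step only searches for a fixed substring and replaces it, which I take to be what "elementary operations" means here.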

Can anyone help me understand the concept of effectiveness and this description? I would also like to know how it relates to Markov algorithms. Thank you.