The system under consideration may be in any one of n states at any point in time; its probability law is that of a Markov process that depends on the policy (control) chosen. The return to the system over a given planning horizon is the integral over that horizon of ...
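
As a rough illustration of the setup described above, the following sketch simulates one sample path of a policy-dependent continuous-time Markov chain and accumulates an integral along it. Everything named here is an assumption made for illustration: the function `simulate_return`, the rate table `rates`, the restriction to a stationary policy, and in particular `reward_rate`, which is only a hypothetical placeholder for the integrand that the text above leaves unspecified.

```python
import random


def simulate_return(rates, reward_rate, policy, horizon, start=0, seed=0):
    """Simulate one path of a controlled continuous-time Markov chain.

    rates[a][i][j]  : hypothetical transition rate i -> j (j != i) under action a
    reward_rate[i][a]: hypothetical placeholder for the unspecified integrand
    policy[i]       : action used in state i (stationary policy, an assumption)
    horizon         : length of the planning horizon (assumed positive)
    """
    rng = random.Random(seed)
    t, state, total = 0.0, start, 0.0
    while True:
        remaining = horizon - t
        action = policy[state]
        # Outgoing transitions with positive rate under the chosen action.
        out = [(j, q) for j, q in enumerate(rates[action][state])
               if j != state and q > 0]
        exit_rate = sum(q for _, q in out)
        # Holding time is exponential in the total exit rate;
        # a state with no outgoing rate is absorbing.
        hold = rng.expovariate(exit_rate) if exit_rate > 0 else float("inf")
        if hold >= remaining:
            # The horizon ends before the next jump: integrate the last piece.
            return total + reward_rate[state][action] * remaining
        total += reward_rate[state][action] * hold
        t += hold
        # Jump to state j with probability proportional to its rate.
        targets = [j for j, _ in out]
        weights = [q for _, q in out]
        state = rng.choices(targets, weights=weights)[0]


# Two states, two actions; all numbers are made up for illustration.
rates = {
    0: [[0.0, 1.0], [2.0, 0.0]],
    1: [[0.0, 0.5], [4.0, 0.0]],
}
reward_rate = [[1.0, 0.5], [0.0, -1.0]]
policy = [0, 1]
sample = simulate_return(rates, reward_rate, policy, horizon=10.0)
```

Averaging `simulate_return` over many seeds gives a Monte Carlo estimate of the expected return under a fixed policy, which is one simple way to compare candidate policies in a small model of this kind.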