In reinforcement learning, the atmosphere is usually represented being a Markov decision process (MDP). Numerous reinforcements learning algorithms use dynamic programming procedures.[fifty three] Reinforcement learning algorithms tend not to believe expertise in an exact mathematical design in the MDP and so are made use of when actual versions ar