News

We propose for risk-sensitive control of finite Markov chains a counterpart of the popular Q-learning algorithm for classical Markov decision processes. The algorithm is shown to converge with ...