We consider a discrete-time Markov Decision Process with an infinite horizon. The criterion to be maximized is the sum of several standard discounted rewards, each with its own discount factor.
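As a rough illustration of this criterion, the sketch below evaluates the sum of several discounted returns along one sampled trajectory. The function name `mixed_discounted_return`, the per-stream reward lists, and the truncation of the infinite horizon to a finite number of steps are all assumptions made for this example; they are not taken from the source.

```python
def mixed_discounted_return(reward_streams, discount_factors):
    """Sum of discounted returns, one discount factor per reward stream.

    reward_streams[k][t] is the k-th reward received at time step t
    (hypothetical layout). The infinite horizon is approximated by the
    finite length of each stream.
    """
    if len(reward_streams) != len(discount_factors):
        raise ValueError("need exactly one discount factor per reward stream")
    total = 0.0
    for rewards, beta in zip(reward_streams, discount_factors):
        if not 0.0 <= beta < 1.0:
            raise ValueError("each discount factor must lie in [0, 1)")
        # Standard discounted sum for this stream: sum over t of beta**t * r_t
        total += sum(r * beta**t for t, r in enumerate(rewards))
    return total


# Two reward streams with different discount factors (illustrative values).
value = mixed_discounted_return([[1.0, 1.0, 1.0], [2.0, 0.0, 0.0]], [0.5, 0.9])
```

With a single stream this reduces to the usual discounted return. With several streams, each one weighs future rewards differently, and the criterion is the sum of the per-stream discounted sums.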
Sam Adeyemi is CEO of Sam Adeyemi GLC Inc., a global leadership consultancy with a vision to raise high-impact leaders. Creativity isn’t just creation. In fact, one of the most underrated strengths of ...