Percentile objective criteria in limiting average Markov Control Problems
Date
1989
Authors
Filar, Jerzy A
Krass, Dmitry
Ross, Keith W
Publisher
Institute of Electrical and Electronics Engineers
Abstract
Infinite horizon Markov Control Problems, or Markov Decision
Processes (MDPs, for short), have been extensively studied since
the 1950s. One of the most commonly considered versions is
the so-called "limiting average reward" model. In this model
the controller aims to maximize the expected value of the limit-average
("long-run average") of an infinite stream of single-stage
rewards or outputs. There are now a number of good algorithms
for computing optimal deterministic policies in limiting average
MDPs. In this paper we adopt the point of view that there are
many natural situations where the controller is interested in finding
a policy that will achieve a sufficiently high long-run average
reward (that is, a target level) with a sufficiently high probability
(that is, a percentile).
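
The contrast between the two criteria can be written out as a rough sketch. The notation is an assumption made here for illustration and does not appear in the abstract: R_t is the single-stage reward at time t, \pi is a policy, \tau is the target level, and \alpha is the required probability (the percentile).

```latex
% Illustrative notation only: R_t, \pi, \tau, \alpha are assumed symbols,
% not taken from the abstract.
\[
  \text{expected-value criterion:}\quad
  \max_{\pi}\;
  \mathbb{E}_{\pi}\!\left[\liminf_{T\to\infty}\frac{1}{T}\sum_{t=1}^{T} R_t\right]
\]
\[
  \text{percentile criterion:}\quad
  \text{find } \pi \text{ such that }\;
  \mathbb{P}_{\pi}\!\left(\liminf_{T\to\infty}\frac{1}{T}\sum_{t=1}^{T} R_t \ge \tau\right) \ge \alpha
\]
```

Under this reading, the first criterion ranks policies by a single number, while the second asks whether any policy meets a given pair (\tau, \alpha).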
Keywords
Mathematics, Markov Decision Processes
Citation
Filar, J.A., Krass, D. and Ross, K.W., 1989. Percentile objective criteria in limiting average Markov Control Problems. Proceedings of the 28th IEEE Conference on Decision and Control, vol. 2, 1273-1276.