Factbites
 Where results make sense

Topic: Markov decision process


  
  Markov chain - Wikipedia, the free encyclopedia
Markov chains are related to Brownian motion and the ergodic hypothesis, two topics in physics which were important in the early years of the twentieth century, but Markov appears to have pursued this out of a mathematical motivation, namely the extension of the law of large numbers to dependent events.
Markov chains also have many applications in biological modelling, particularly population processes, which are useful in modelling processes that are (at least) analogous to biological populations.
Markov processes can also be used to generate superficially "real-looking" text given a sample document: they are used in various pieces of recreational "parody generator" software (see dissociated press, Jeff Harrison, Mark V Shaney or [1]).
en.wikipedia.org /wiki/Markov_chain   (1800 words)
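
The text-generation idea in the snippet above can be sketched in a few lines. This is a minimal illustration, not the code of any of the named parody generators; the function names `build_chain` and `generate` and the `order` parameter are assumptions made for the example.

```python
import random
from collections import defaultdict

def build_chain(text, order=1):
    """Map each tuple of `order` consecutive words to the words seen right after it."""
    words = text.split()
    chain = defaultdict(list)
    for i in range(len(words) - order):
        chain[tuple(words[i:i + order])].append(words[i + order])
    return chain

def generate(chain, length, seed=None):
    """Random-walk the chain: the next word depends only on the current state."""
    rng = random.Random(seed)
    state = rng.choice(sorted(chain))  # sorted so a fixed seed is reproducible
    out = list(state)
    while len(out) < length:
        followers = chain.get(state)
        if not followers:  # dead end: this state never had a successor in the sample
            break
        out.append(rng.choice(followers))
        state = tuple(out[-len(state):])
    return " ".join(out)
```

Calling `generate(build_chain(sample_text), 50)` on a longer sample document yields text that is locally plausible, because every adjacent word pair occurs in the sample, but has no global structure.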

  
 Journal of the Brazilian Computer Society - Simulation of controlled queuing systems and its application to optimal ...   (Site not responding. Last check: 2007-08-17)
A Markov decision process is a controlled stochastic process satisfying the Markov property with costs assigned to state transitions.
In general, a solution to a Markov decision problem is a policy, mapping states to actions, that determines state transitions to minimize the cost according to the performance criterion [1].
Whenever an event is extracted from the head of EQ and processed by EH, the next event of the same type is scheduled immediately according to the given probability distribution using a random number generator and is put into the tail of EQ.
www.scielo.br /scielo.php?script=sci_arttext&pid=S0104-65001999000100005&lng=pt&nrm=iso   (4325 words)

  
 Conservation Ecology: Protocol and Practice in the Adaptive Management of Waterfowl Harvests
A generalization of the Markov decision process permits the calculation of optimal actively adaptive policies, but it is not yet clear how state-specific harvest actions differ between passive and active approaches.
Subsequent improvements in the regulatory process were framed in terms of adaptive resource management, in which there is an explicit accounting for uncertainty and for the influence of management in reducing that uncertainty (Williams and Johnson 1995).
The process is passively adaptive in the sense that informative changes in model probabilities occur as an unplanned by-product of the regulatory process (Walters 1986).
sunsite.wits.ac.za /eco/vol3/iss1/art8   (5494 words)

  
 Markov Decision Process (MDP) Toolbox for Matlab
A Markov Decision Process (MDP) is just like a Markov Chain, except the transition matrix depends on the action taken by the decision maker (agent) at each time step.
MDPs assume that the complete state of the world is visible to the agent.
Unfortunately, the observations are not Markov (because two different states might look the same), which invalidates all of the MDP solution techniques.
www.cs.ubc.ca /~murphyk/Software/MDP/mdp.html   (1680 words)
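
The description above, a Markov chain whose transition matrix depends on the chosen action, can be sketched as follows. The two states, the action names `stay` and `move`, and all probabilities are hypothetical values chosen for illustration; nothing here is taken from the MATLAB toolbox.

```python
import random

# Hypothetical 2-state, 2-action MDP.
# P[a][s][s2] = Pr(next state = s2 | current state = s, action = a)
P = {
    "stay": [[0.9, 0.1], [0.1, 0.9]],
    "move": [[0.2, 0.8], [0.8, 0.2]],
}

def step(state, action, rng):
    """Sample the next state from the row selected by (action, state)."""
    row = P[action][state]
    return rng.choices(range(len(row)), weights=row)[0]

def rollout(policy, start, n, seed=0):
    """Follow a policy (a state -> action mapping) for n steps."""
    rng = random.Random(seed)
    s, path = start, [start]
    for _ in range(n):
        s = step(s, policy[s], rng)
        path.append(s)
    return path
```

Fixing a policy, for example `{0: "move", 1: "stay"}`, picks one row per state, so the resulting process is again an ordinary Markov chain.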

  
 Markov Decision Process Editor   (Site not responding. Last check: 2007-08-17)
In the Markov Process on the left, the probability of moving from the start state (S) to state A is 0.7 while the probability of moving from S to B is 0.3.
MDPs are used to describe all sorts of processes.
Knowledge of Markov Decision Processes is not required (except for some of the extensions).
www.cs.bris.ac.uk /~kovacs/student.projects/mdp.editor.html   (504 words)

  
 Rich Sutton's Publications
In this paper we present a framework based on Markov decision processes and semi-Markov decision processes for phrasing this problem, a basic theorem regarding the improvement in performance that can be obtained by switching flexibly between given controllers, and example applications of the theorem.
Sutton, R.S. On the significance of Markov decision processes.
MDPs may be becoming a common focal point for different approaches to understanding the mind.
web.cs.ualberta.ca /~sutton/publications.html   (12969 words)

  
 [No title]   (Site not responding. Last check: 2007-08-17)
A “perfectly flexible” policy is defined as instantly matching process capabilities to changes in market preferences, while a “robust” policy is defined as selecting and employing only a single invariant process technology.
Simulation experiments and numerical examples demonstrate when a flexible process strategy is preferred to a robust strategy, and vice versa.
With an objective of profit maximization, we show that when the cost of switching production processes is very high, the optimal policy is to select a single robust process and to never switch from it.
leeds-faculty.colorado.edu /lawrence/research/mapps   (413 words)

  
 Markov Decision Process Toolbox for MATLAB   (Site not responding. Last check: 2007-08-17)
Markov Decision Process (MDP) Toolbox v2.0 for MATLAB
The MDP toolbox provides functions for solving discrete-time Markov decision processes: finite horizon, value iteration, policy iteration, and linear programming algorithms, with some variants.
The different functions of the MDP toolbox for MATLAB are listed on one page.
www.inra.fr /bia/T/MDPtoolbox   (172 words)
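
As an illustration of one of the algorithms listed above, here is a minimal value iteration sketch in plain Python. It is not the toolbox's MATLAB code or API; the data layout (`P[a][s][s2]`, `R[a][s]`), the discount `gamma` and the stopping tolerance are assumptions made for the example.

```python
def value_iteration(P, R, gamma=0.9, tol=1e-8, max_iter=10_000):
    """P[a][s][s2]: transition probability; R[a][s]: expected immediate reward.

    Returns the (approximately) optimal state values and a greedy policy.
    """
    n_states = len(next(iter(P.values())))
    V = [0.0] * n_states
    for _ in range(max_iter):
        # One Bellman backup: action values computed from the current V.
        Q = {a: [R[a][s] + gamma * sum(p * v for p, v in zip(P[a][s], V))
                 for s in range(n_states)] for a in P}
        V_new = [max(Q[a][s] for a in P) for s in range(n_states)]
        done = max(abs(x - y) for x, y in zip(V_new, V)) < tol
        V = V_new
        if done:
            break
    policy = [max(P, key=lambda a: Q[a][s]) for s in range(n_states)]
    return V, policy
```

For a toy problem where `stay` keeps the state, `go` switches it, and only staying in state 1 pays a reward of 1, the values converge to about 9 and 10 with the policy `["go", "stay"]`.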

  
 POMDP information page
A POMDP is a partially observable Markov decision process.
Markov model talk: The slides for a talk by Michael Littman on the use of Markov models (MDPs, POMDPs, and Markov games) in AI.
The talk was presented at the University of Pennsylvania, Brandeis University, and the University at Stony Brook during the spring of 1995.
www.cs.duke.edu /~mlittman/topics/pomdp-page.html   (764 words)

  
 IPDPM Markov decision processes   (Site not responding. Last check: 2007-08-17)
An aggregate Markov Decision Process model that defines the state, actions, and randomness associated with fab-level decision making; and methods that develop optimal policies for fab-level decision making.
The planning and scheduling of a semiconductor fab is carried out according to a general hierarchical framework based on a temporal and/or physical decomposition of the system.
Markov Decision Process (MDP) methodology will be used to develop a finite-horizon transient model directed towards a total life cycle decision-support model, incorporating more detailed, but very practically significant, issues than the models found in the current research literature.
www.isr.umd.edu /Labs/Virtual/IPDPM/catalog3.htm   (276 words)

  
 Reliability-Based Structural Design with Markov Decision Processes   (Site not responding. Last check: 2007-08-17)
Using a Markov decision process (MDP) model and structural reliability theory, a designer at the initial design stage is able to incorporate a reliability-based model of the lifetime process of the structure.
The advantage of MDP is that it systematically characterizes the entire process, including decisions, costs, and system performance.
For an existing structure, the approach gives a decision maker a future maintenance policy leading to identification of the minimum discounted expected future cost of the structure, based on its present condition, and maintains acceptable reliability.
www.pubs.asce.org /WWWdisplay.cgi?9502215   (207 words)

  
 3.6 Markov Decision Processes
A particular finite MDP is defined by its state and action sets and by the one-step dynamics of the environment.
This system is then a finite MDP, and we can write down the transition probabilities and the expected rewards, as in Table 3.1.
Assuming a finite MDP with a finite number of reward values, write an equation for the transition probabilities and the expected rewards in terms of the joint conditional distribution in (3.5).
www.cs.ualberta.ca /~sutton/book/3/node7.html   (824 words)

  
 Experience-Based Model Predictive Control Using Reinforcement Learning - Rudy Negenborn, Bart De Schutter, Marco ...   (Site not responding. Last check: 2007-08-17)
In this paper we propose the use of MPC to control systems that can be described as Markov decision processes.
We discuss how a straightforward MPC algorithm for Markov decision processes can be implemented, and how it can be improved in terms of speed and decision quality by considering value functions.
The proposed approach can be beneficial for any system that can be modeled as a Markov decision process, including systems found in areas like logistics, traffic control, and vehicle automation.
www.negenborn.net /pubs/model_predictive_control/model_predictive_control.htm   (265 words)

  
 Markov Decision Process terrain generator
I hope you are familiar with games like Starcraft, Warcraft, Civilization, etc. This is a small Python script that creates terrains similar to the ones used in those kinds of games.
An MDP terrain generator works from a probability matrix that you define.
For example, if you are near water, there is a 60% probability that your current cell is water as well, a 30% probability that it is shore, and a 0.001% probability that it is a mountain peak.
www.ee.ualberta.ca /~grzegorz/MDP.html   (206 words)
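
The probability-matrix idea can be sketched for a single row of cells. This is not the original script: the terrain types other than water and shore, and every probability except the 0.6 and 0.3 figures quoted above, are hypothetical values made up for the example.

```python
import random

# Hypothetical neighbour-conditioned probabilities. Only the 0.6 / 0.3 figures
# in the "water" row echo the description; the rest are invented.
NEXT = {
    "water": {"water": 0.6, "shore": 0.3, "plain": 0.1},
    "shore": {"water": 0.3, "shore": 0.3, "plain": 0.4},
    "plain": {"shore": 0.2, "plain": 0.7, "hill": 0.1},
    "hill":  {"plain": 0.5, "hill": 0.5},
}

def generate_row(width, start="water", seed=None):
    """Fill a row left to right; each cell is drawn given its left neighbour."""
    rng = random.Random(seed)
    row = [start]
    while len(row) < width:
        options = NEXT[row[-1]]
        cell = rng.choices(list(options), weights=list(options.values()))[0]
        row.append(cell)
    return row
```

Because a terrain type can only follow types listed in its neighbour's row, water never sits directly next to a hill here, which is what gives such maps their smooth transitions. A two-dimensional generator would also need to condition on the cell above.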

  
 Brian's Digest: Markovian Decision Processes   (Site not responding. Last check: 2007-08-17)
I am looking for a good introductory book on hidden Markov models, especially their application in statistical problems.
Your MDP is a special case of a partially observed MDP.
I am looking for references to recent work on approximation methods for partially observable Markov decision processes.
www.worms.ms.unimelb.edu.au /digest/markovian_dp.html   (399 words)

  
 Multiscale Analysis of Markov Decision Processes
First of all they can efficiently encode functions on general state spaces for these processes, thus performing a dimensionality reduction task which is recognized as fundamental in order to be able to perform efficient computation of these processes.
Secondly, there are many connections between Markov Decision Processes and harmonic analysis of Markov Chains, stemming from the connections between the Bellman equation and the Green's function or fundamental matrix of a Markov Chain.
Policy evaluation is a critical step in the approximate solution of large Markov decision processes (MDPs), typically requiring $O(S^3)$ time to directly solve the Bellman system of $S$ linear equations (where $S$ is the state space size).
www.math.yale.edu /~mmm82/DynProgRL.html   (710 words)
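
The cubic cost of direct policy evaluation mentioned above comes from solving the linear system (I - gamma P) V = R, where P and R are the transition matrix and reward vector under a fixed policy. The sketch below does this with plain Gaussian elimination; it is a textbook baseline for comparison, not the multiscale method the page describes, and the names `evaluate_policy`, `P`, `R` and `gamma` are assumptions.

```python
def evaluate_policy(P, R, gamma):
    """Solve (I - gamma * P) V = R by Gaussian elimination with partial pivoting.

    P[i][j]: Pr(j | i) under the fixed policy; R[i]: expected reward in state i.
    The three nested loops over the S states are the source of the O(S^3) cost.
    """
    n = len(R)
    # Augmented matrix [I - gamma * P | R]
    A = [[(1.0 if i == j else 0.0) - gamma * P[i][j] for j in range(n)] + [R[i]]
         for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(col + 1, n):
            f = A[r][col] / A[col][col]
            for c in range(col, n + 1):
                A[r][c] -= f * A[col][c]
    # Back substitution on the upper-triangular system.
    V = [0.0] * n
    for i in reversed(range(n)):
        V[i] = (A[i][n] - sum(A[i][j] * V[j] for j in range(i + 1, n))) / A[i][i]
    return V
```

For 0 <= gamma < 1 and a stochastic matrix P the system always has a unique solution, so the pivots never vanish.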

  
 CDC2000 Markov Decision Processes   (Site not responding. Last check: 2007-08-17)
We introduce the concept of a randomized stationary stopping time which is a mixed extension of the entry time of a stopping region and prove the existence of an optimal constrained pair of stationary policy and stopping time by utilizing a Lagrange multiplier approach.
We consider a Markov decision process with an uncountable state space for which the vector performance functional has the form of expected total rewards.
This is done by considering two extreme situations which occur when a bandit has been played N times; the situation where the decision maker stops learning and the situation where the decision maker acquires full information about that bandit.
www.mm.anadolu.edu.tr /ieeecss/39cdc/CD00S114.HTM   (996 words)

  
 3.6 Markov Decision Processes
Example 3.7: Recycling Robot MDP The recycling robot (Example 3.3) can be turned into a simple example of an MDP by simplifying it and providing some more details.
The agent makes its decisions solely as a function of the energy level of the battery.
Exercise 3.7 Assuming a finite MDP with a finite number of reward values, write an equation for the transition probabilities and the expected rewards in terms of the joint conditional distribution in (3.5).
www.cs.ualberta.ca /~sutton/book/ebook/node33.html   (734 words)

  
 manicwave: November 2002 Archives   (Site not responding. Last check: 2007-08-17)
Decisions that humans and computers make on all levels usually have two types of impact: (i) they cost or save time, money, or other resources, or they bring revenues, and (ii) they affect the future by influencing the dynamics.
In many situations, decisions with the largest immediate profit may not be good in view of future events.
The goal is to reduce time to process, but increasing the value of one control variable may lead to a negative impact based on the value of another.
www.manicwave.com /blog/archives/2002_11.html   (1520 words)

  
 Random walks and markov decision processes - INFORMS Online Discussion
Topic: Random walks and markov decision processes
markov process) as driving process for state variables into an infinite
I know that a homogeneous markov chain is needed for finding a stationary
www.informs.org /ubb/Forum1/HTML/000011.html   (120 words)

  
 Planning & Scheduling
"Planning is the process of generating (possibly partial) representations of future behavior prior to the use of such plans to constrain or control that behavior.
The outcome is usually a set of actions, with temporal and other constraints on them, for execution by some agent or agents.
One of the projects you'll find there is Integrated Planning and Scheduling: "In collaboration with the researchers at SRI, we are investigating the development of techniques for tighter integration of planning and scheduling processes.
www.aaai.org /AITopics/html/planning.html   (1891 words)

  
 IntelligentSystems.html
In the past couple of years, artificial intelligence researchers have explored how to exploit the structure of Markov decision process problems to develop tractable solution methods.
b) Describe why Markov decision process problems for elevator scheduling are still difficult to solve despite the recent advances in solution methods and speculate on research directions that have the potential to result in tractable solution methods for such Markov decision process problems.
Recent critics have argued that this view cannot accommodate the facts that the social, cultural, and material environment is essential to human thinking (i.e., human cognition is "embodied" and "embedded").
www-static.cc.gatech.edu /student.services/phd/qualExams/03fall/IntelligentSystems.html   (1724 words)

  
 POMDP - Partially Observable Markov Decision Process : Hajime Fujita   (Site not responding. Last check: 2007-08-17)
I am working on a model-based RL scheme for large-scale multi-agent problems with partial observability, and I apply it to a card game, "Hearts".
Since this game is a well-defined example of an imperfect information game, it can be approximately formulated as a partially observable Markov decision process (POMDP) for a single learning agent.
To reduce the computational cost, we use a sampling technique in which the heavy integration required for the estimation and prediction can be approximated by a plausible number of samples.
hawaii.aist-nara.ac.jp /~hajime-f   (479 words)

  
 Partially Observable Markov Decision Process   (Site not responding. Last check: 2007-08-17)
This leads to a new kind of model.
We've got states, actions, transitions, and rewards, just like an MDP.
But now, we also have ``observations'' (a set of things that can be perceived by the agent).
www.cs.duke.edu /~mlittman/courses/cps271/lect-21/node5.html   (52 words)
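
Since the agent in such a model sees observations rather than states, a common approach is to track a belief, a probability distribution over states, and update it with Bayes' rule after each action and observation. The sketch below shows that standard update; it is not taken from the lecture notes, and the layouts `T[a][s][s2]` and `O[s2][o]` are assumptions made for the example.

```python
def belief_update(belief, T, O, action, obs):
    """Bayes update of a belief after taking `action` and perceiving `obs`.

    belief[s]   : current Pr(state = s)
    T[a][s][s2] : Pr(next state = s2 | state = s, action = a)
    O[s2][o]    : Pr(observation = o | next state = s2)
    """
    n = len(belief)
    unnorm = [O[s2][obs] * sum(T[action][s][s2] * belief[s] for s in range(n))
              for s2 in range(n)]
    total = sum(unnorm)
    if total == 0:
        raise ValueError("observation has zero probability under this belief")
    return [x / total for x in unnorm]
```

With a uniform belief over two states, an action that leaves the state unchanged, and a sensor that reports the true state with probability 0.8, one observation shifts the belief to roughly 0.8 and 0.2.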

  
 Handbook of Markov Decision Processes
Singular Perturbations of Markov Chains and Decision Processes;
Markov Decision Processes in Finance and Dynamic Options; M.
Applications of Markov Decision Processes in Communication Networks; E.
www.ams.sunysb.edu /~feinberg/MDPHandBook   (73 words)

  
 CAS Publications (1998-2005)
Krishnamurthy, V., Wahlberg, B., and Lingelbach, F. A value iteration algorithm for partially observed markov decision process multi-armed bandits.
In 43rd IEEE Conference on Decision and Control (Bahamas, Dec. 2004), pp.
In Proceedings of the 39th IEEE Conference on Decision and Control (Sydney, Australia, December 2000).
www.cas.kth.se /cas-publ.html   (7321 words)

  
 Yossi Aviv : AF1996a   (Site not responding. Last check: 2007-08-17)
In this paper, we develop a stylized, partially observed Markov decision process (POMDP) framework, to study a dynamic pricing problem faced by sellers of fashion-like goods.
We develop an upper bound approximation for the seller's decision problem and use it to propose a heuristic pricing policy.
We use the approximation and the heuristic to study, and gain insights into several important managerial questions.
www.olin.wustl.edu /faculty/aviv/POMDP02.html   (146 words)

  
 Presentation of MDP toolbox documentation   (Site not responding. Last check: 2007-08-17)
P(s,s',a): probability of moving to state s' when the system is in state s and action a is performed by the decision maker
Reward functions R and PR:
R(s,s',a): reward when the system is in state s at decision epoch t and in state s' at decision epoch t+1, with action a performed by the decision maker
PR(s,a): reward when the system is in state s at decision epoch t and action a is performed by the decision maker
The documentation pages can be seen with MATLAB navigator (used for the MATLAB help).
www.inra.fr /bia/T/MDPtoolbox/DOCUMENTATION.html   (197 words)
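
The two reward forms above are linked by an expectation over the next state: PR(s,a) is the sum over s' of P(s,s',a) * R(s,s',a). A minimal sketch of that conversion follows, assuming nested Python lists indexed `[s][s2][a]`; the function name `expected_reward` is hypothetical and this is not the toolbox's MATLAB code.

```python
def expected_reward(P, R):
    """PR[s][a] = sum over s2 of P[s][s2][a] * R[s][s2][a]."""
    n_states = len(P)
    n_actions = len(P[0][0])
    return [[sum(P[s][s2][a] * R[s][s2][a] for s2 in range(n_states))
             for a in range(n_actions)]
            for s in range(n_states)]
```

The PR form is what dynamic programming backups need, since they only use the expected immediate reward of a state-action pair.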

  
 EconPapers: Mean Variance Optimality Criteria for Discounted Markov Decision Process   (Site not responding. Last check: 2007-08-17)
Abstract: The criterion of maximizing expected rewards has been widely used in Markov decision processes following Howard [2].
Recently considerations related to higher moments of rewards have also been incorporated by Jaquette [4] and Goldwerger [1].
This paper considers mean variance criteria for discounted Markov decision processes.
econpapers.repec.org /paper/iimiimawp/246.htm   (257 words)

Copyright © 2005-2007 www.factbites.com Usage implies agreement with terms.