Factbites
 Where results make sense
About us   |   Why use us?   |   Reviews   |   PR   |   Contact us  

Topic: Reinforcement learning



In the News (Tue 1 Dec 09)

  
  Reinforcement learning - Scholarpedia
Reinforcement learning (RL) is learning by interacting with an environment.
The reinforcement signal that the RL-agent receives is a numerical reward, which encodes the success of an action's outcome, and the agent seeks to learn to select actions that maximize the accumulated reward over time.
Reinforcement learning is also reflected at the level of neuronal sub-systems or even at the level of single neurons.
www.scholarpedia.org /article/Reinforcement_learning   (5188 words)

  
 Reinforcement Learning
That is, before learning, the agent may not know what will happen when it takes a particular action in a particular state, but the only relevant information for deciding what action to take is the current state, which the agent does have access to.
The goal of reinforcement learning is to figure out how to choose actions in response to states so that reinforcement is maximized.
Early in learning, it is better to explore because the knowledge the agent has gained so far is not very reliable and because a number of options may still need to be tried.
www.cs.indiana.edu /~gasser/Salsa/rl.html   (2197 words)
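
The explore-early, exploit-later idea in the entry above can be sketched with an epsilon-greedy rule whose exploration rate shrinks over time. This is only an illustrative sketch, not the method of the linked page: the names `epsilon_for_step` and `choose_action`, the linear schedule, and the default constants are all assumptions made for the example.

```python
import random


def epsilon_for_step(step, start=1.0, end=0.05, decay_steps=1000):
    """Linearly anneal the exploration rate from `start` down to `end`.

    The schedule shape and the default constants are assumptions.
    """
    fraction = min(step / decay_steps, 1.0)
    return start + fraction * (end - start)


def choose_action(q_values, epsilon, rng=random):
    """Return a random action index with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))  # explore
    return max(range(len(q_values)), key=lambda a: q_values[a])  # exploit


if __name__ == "__main__":
    estimates = [0.1, 0.9, 0.3]  # hypothetical value estimates for three actions
    for step in (0, 500, 1000):
        eps = epsilon_for_step(step)
        print(step, round(eps, 3), choose_action(estimates, eps))
```

With epsilon near 1 the agent tries almost anything; as epsilon falls it increasingly trusts its own estimates.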

  
 REINFORCEMENT LEARNING WITH
In a reinforcement learning system, the function f(x, u) typically represents the utility of performing action u in state x, so the u that maximizes f(x,u) is the optimal action to perform in state x.
Reinforcement was proportional to the pole angle squared, with an additional negative reinforcement when the pole exceeded 12 degrees from vertical.
The learning system was allowed to learn for only 60 seconds of simulated time, during which a random action in the range [-10,10] newtons was chosen with uniform probability on each time step.
www.leemon.com /papers/wirefit/wirefit.html   (3775 words)

  
 Introduction to Reinforcement Learning
Reinforcement Learning is what a computer-based agent does when it is placed in an unexplored environment.
Reinforcement is usually considered to be a separate input line to the agent that supplies a real-valued response to the agent's actions.
A second difference is that reinforcement usually depends on the combination of the agent's action and the state it was performed in, whereas a sensor reading never depends on the performed action.
www.scs.carleton.ca /%7Edbatalov/REPORT.htm   (1831 words)

  
 Positive Reinforcement Tutorial
Note that if positive reinforcement were being taught in a standard course, much more background material would be provided about the concept to give students an appropriate context in which to understand the concept, relate it to other concepts, and eventually to be able to apply the concept.
The first item is an example of positive reinforcement because presentation of attention was dependent upon the target behavior of being on-feet, and this resulted in an increase in the level of the target behavior.
The first item is an example of positive reinforcement, because presentation of points was dependent on flexing the elbow, and the procedure caused an increase in the level of flexing the elbow.
server.bmod.athabascau.ca /html/prtut/reinpair.htm   (2007 words)

  
 Solving Optimal control and Search Problems with Reinforcement Learning in MATLAB
Reinforcement learning methods (Bertsekas and Tsitsiklis, 1995) are a way to deal with this lack of knowledge by using each sequence of state, action, and resulting state and reinforcement as a sample of the unknown underlying probability distribution.
The reinforcement learning agent is learning a prediction of the number of steps required to leave the valley from every state, where a state consists of a position and velocity of the car.
The resulting environment for experimenting with reinforcement learning methods will be very helpful, both to students wanting to learn more about reinforcement learning and to researchers wanting to study novel extensions of current reinforcement learning algorithms or to apply reinforcement learning to new tasks.
www.cs.colostate.edu /~anderson/res/rl/matlabpaper/rl.html   (1569 words)

  
 Neuro-pilot Demo
This is a project to try to make an artificial neural network learn something challenging and interesting.
This project is an example of a reinforcement learning problem.
Reinforcement learning problems are generally harder than those that work by showing examples (supervised learning).
freespace.virgin.net /michael.fairbank/neuropilot/index.html   (796 words)

  
 Reinforcement-Learning Model
It can learn to do this over time by systematic trial and error, guided by a wide variety of algorithms that are the subject of later sections of this paper.
Another difference from supervised learning is that on-line performance is important: the evaluation of the system is often concurrent with learning.
On the other hand, reinforcement learning, at least in the kind of discrete cases for which theory has been developed, assumes that the entire state space can be enumerated and stored in memory--an assumption to which conventional search algorithms are not tied.
www.cs.cmu.edu /afs/cs/project/jair/pub/volume4/kaelbling96a-html/node2.html   (543 words)

  
 Tor's Reinforcement Learning Simulator
Reinforcement learning involves the update of an action policy (usually represented as a parameterized function over the state-space of the problem) using a scalar reward signal.
In their simplest manifestation reinforcement learning algorithms basically try to determine the discounted expected reward for taking a given action from a given state by exploring the state-space using an "informed random walk".
Reinforcement learning was proposed by Sutton and Barto as a heuristic method motivated by behavior psychology.
www.eecg.utoronto.ca /%7Eaamodt/BAScThesis/RLsim.htm   (880 words)

  
 Q530: Reinforcement Learning
The reinforcement and next-state functions are not normally known to the agent, and they depend only on the current state and action.
But the sum of all future reinforcements may be infinite when there is no terminal state, and besides, we may want to weight the future less than the here-and-now, so instead a discounted cumulative reinforcement is normally used: future reinforcements are weighted by a value γ between 0 and 1 (see below for mathematical details).
This sets the Q-value to be the sum of the reinforcement just received and the discounted best Q-value for the next state, that is, what the agent currently believes the Q-value for the next state to be (what is stored in its lookup table).
www.indiana.edu /~gasser/Q530/Notes/rl.html   (2355 words)
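
As a rough illustration of the two quantities described in the entry above, the sketch below computes a discounted cumulative reinforcement and the one-step value "reinforcement just received plus discounted best Q-value for the next state". The function names and the example numbers are assumptions for illustration; `gamma` stands for the discount γ mentioned in the snippet.

```python
def discounted_return(rewards, gamma):
    """Sum the rewards, weighting the reward k steps ahead by gamma ** k."""
    total = 0.0
    # Work backwards so each step is: reward + gamma * (everything after it).
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


def q_target(reward, next_q_values, gamma):
    """Reward just received plus the discounted best Q-value of the next state.

    `next_q_values` holds the agent's current table entries for the next state,
    one per action (assumed non-empty, i.e. the next state is not terminal).
    """
    return reward + gamma * max(next_q_values)


if __name__ == "__main__":
    print(discounted_return([1.0, 1.0, 1.0], gamma=0.5))
    print(q_target(1.0, [0.0, 2.0], gamma=0.5))
```

Because γ is below 1, the sum stays finite even over an unbounded run of bounded rewards, which is the motivation the snippet gives for discounting.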

  
 2 Reinforcement Learning
A standard usage of reinforcement learning is to have some known desired state y, at which rewards are given, leaving open what states the agent might pass through on the way to y.
Q-learning is an attractive method of learning because of the simplicity of the computational demands per timestep, and also because of this proof of convergence to a global optimum, avoiding all local optima.
Learning stops for all other states once the absorbing state is entered, unless there is some sort of artificial reset of the experiment.
www.compapp.dcu.ie /~humphrys/PhD/ch2.html   (3683 words)

  
 Reinforcement Learning
Reinforcement learning is one method developed to deal with such situations.
Reinforcement learning (RL) resembles supervised learning in that some feedback from the environment is given, although that feedback is evaluative rather than instructive.
Learning from interaction with our environment is a fundamental idea underlying most theories of learning.
www.willamette.edu /~gorr/classes/cs449/Reinforcement/reinforcement0.html   (439 words)

  
 Reinforcement Learning for Robots Using Neural Networks — CiteSeerX citation query
A straightforward approach to the curse of dimensionality in reinforcement learning and dynamic programming is to replace the lookup table with a generalizing function approximator such as a neural net.
We present a new approach to reinforcement learning in which the policies considered by the learning process are constrained by hierarchies of partially specified machines.
Reinforcement learning (RL) is based on the idea that the tendency to produce an action should be strengthened (reinforced) if it produces favorable results, and weakened if it produces unfavorable results.
citeseerx.ist.psu.edu /showciting?cid=41328   (1732 words)

  
 IIS Corp. Fuzzy Reinforcement Learning
Reinforcement learning is learning what to do -- how to map situations to actions -- so as to maximize a numerical reward signal.
The learner is not told which actions to take, as in most forms of machine learning, but instead must discover which actions yield the most reward by trying them.
The only way to learn anything at all on these tasks is to generalize from previously experienced states to ones that have never been seen.
www.iiscorp.com /projects/fuzzyRL   (339 words)

  
 Segismundo S. Izquierdo, Luis R. Izquierdo and Nicholas M. Gotts: Reinforcement Learning Dynamics in Social Dilemmas
Macy and Flache (2002) study a variant of Bush and Mosteller's (1955) linear stochastic model of reinforcement learning; this variant is a particular type of a wider class of aspiration-based reinforcement learning models (Bendor, Mookherjee and Ray 2001a).
Reinforcement learners interact with their environment and use their experience to choose or avoid certain actions based on their consequences.
Since learning rates are high, the movement towards the SRE associated with such a mutually satisfactory outcome takes place by large steps, so only a few coordinated moves are sufficient to approach the SRE so much that escape from its neighbourhood becomes very unlikely.
jasss.soc.surrey.ac.uk /11/2/1.html   (8042 words)

  
 Temporal Difference Learning and TD-Gammon
The reinforcement learning paradigm has held great intuitive appeal and has attracted considerable interest for many years because of the notion of the learner being able to learn on its own, without the aid of an intelligent "teacher," from its own experience at attempting to perform a task.
Another problem with many of the traditional approaches to reinforcement learning is that they have been limited to learning either lookup tables or linear evaluation functions, neither of which seem adequate for handling many classes of real-world problems.
Finally, non-deterministic games have the advantage that the target function one is trying to learn, the true expected outcome of a position given perfect play on both sides, is a real-valued function with a great deal of smoothness and continuity, that is, small changes in the position produce small changes in the probability of winning.
www.research.ibm.com /massive/tdl.html   (7500 words)

  
 Reinforcement Learning
The automaton learns by altering the probabilities associated with the actions in response to the reinforcement signal.
To improve generalization of Q learning, a possible idea is to use a neural network as a substitute for the table.
The novel aspect of Q learning is that it assumes no knowledge about state transition and reward functions, which must be learned from the environment.
www.cise.ufl.edu /~fu/Lecture/Learn/reinforcement-fu.html   (945 words)

  
 A. Perez-Uribe Introduction to Reinforcement learning
During learning, the adaptive system tries some actions (i.e., output values) on its environment, then, it is reinforced by receiving a scalar evaluation (the reward) of its actions.
One key aspect of reinforcement learning is a trade-off between exploitation and exploration [4].
Maze learning is presented here as an abstraction of the real navigation problem in mobile autonomous robots: We have an a-priori partition of the sensor space of the robot, a perfect vision, and a discrete world.
www.geocities.com /fastiland/RL/sarsa.html   (939 words)

  
 B551: Reinforcement Learning
More precisely, an optimal Q value for a given state and action is the sum of all reinforcements received if that action is taken in the state, and then the agent follows the optimal policy specified by the other Q values.
The "new" value is combined with the "old" using the learning rate to give the updated Q value appearing in the next line of the chart.
Error-driven learning: for the selected action, the target is the "new" Q value from the Q learning rule.
www.cs.indiana.edu /classes/b551-gass/Notes/reinforcement.html   (655 words)
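
The step in which the "new" value is combined with the "old" one using the learning rate can be sketched as below. This is an assumed, common form of the blend and not necessarily the exact rule on the linked page; the function names are hypothetical. The second function writes the same update in error-driven form, moving the old value toward the target by a fraction of the error.

```python
def blend(old_q, new_q, learning_rate):
    """Weighted average of the stored Q value and the freshly computed one."""
    return (1.0 - learning_rate) * old_q + learning_rate * new_q


def error_driven_update(old_q, target_q, learning_rate):
    """Same update, written as a step along the error (target minus old)."""
    return old_q + learning_rate * (target_q - old_q)


if __name__ == "__main__":
    # Hypothetical numbers: stored value 2.0, new value 10.0, learning rate 0.25.
    print(blend(2.0, 10.0, 0.25))
    print(error_driven_update(2.0, 10.0, 0.25))
```

A learning rate of 0 keeps the old value unchanged, and a learning rate of 1 replaces it outright with the new one.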

  
 Tor's Reinforcement Learning Simulator
Reinforcement learning involves the update of an action policy (usually represented as a parameterized function over the state-space of the problem) using a scalar reward signal.
In their simplest manifestation reinforcement learning algorithms basically try to determine the discounted expected reward for taking a given action from a given state by exploring the state-space using an "informed random walk".
Reinforcement learning was proposed by Sutton and Barto as a heuristic method motivated by behavior psychology.
www.eecg.toronto.edu /~aamodt/BAScThesis/RLsim.htm   (872 words)

  
 Reinforcement Learning: A Survey
This paper surveys the field of reinforcement learning from a computer-science perspective.
Reinforcement learning is the problem faced by an agent that learns behavior through trial-and-error interactions with a dynamic environment.
It concludes with a survey of some implemented systems and an assessment of the practical utility of current methods for reinforcement learning.
www.cs.washington.edu /research/jair/volume4/kaelbling96a-html/rl-survey.html   (90 words)

  
 CS395T: Reinforcement Learning: Theory and Practice -- Fall 2004
Whether we are learning to drive a car or to hold a conversation, we are all acutely aware of how our environment responds to what we do, and we seek to influence what happens through our behavior.
The learner is not told which actions to take, as in most forms of machine learning, but instead must discover which actions yield the most reward by trying them.
The emphasis is to be on empirically analyzing various learning algorithms and reporting on the results.
www.cs.utexas.edu /%7Epstone/Courses/395Tfall04   (1014 words)

  
 [No title]
Reinforcement learning has shown a lot of promise in recent years as a highly effective approach for building autonomous learning agents.
Most work on reinforcement learning (and closely related decision-theoretic planning) is based on propositional or attribute-value representations of the state and actions and is inapplicable to domains with relational structure without extensive feature engineering.
Given the renewed interest in relational representations in machine learning and AI, this appears to be an appropriate time to encourage efforts to integrate these approaches into a comprehensive framework that includes expressive representations, inference, and action execution.
eecs.oregonstate.edu /research/rrl   (462 words)

  
 Black Jack and Reinforcement Learning
Learning-with-a-teacher systems are not very useful since the target outputs for a given stage of the game are not known.
The learning algorithm it may use is called the SARSA algorithm, a reinforcement learning algorithm introduced by G. Rummery and M. Niranjan [2].
In the first window, Learning, you may select the different learning options, that is, the number of episodes to train the computer and the number of games per episode.
lslwww.epfl.ch /~anperez/BlackJack/classes/RLJavaBJ.html   (781 words)

  
 Reinforcement Learning: A Survey
This paper surveys the field of reinforcement learning from a computer-science perspective.
Reinforcement learning is the problem faced by an agent that learns behavior through trial-and-error interactions with a dynamic environment.
It concludes with a survey of some implemented systems and an assessment of the practical utility of current methods for reinforcement learning.
www.cs.cmu.edu /afs/cs/project/jair/pub/volume4/kaelbling96a-html/rl-survey.html   (90 words)

  
 Reinforcement Learning Tetris Example
TD learning is more likely to improve learning in an environment with more states and a lower branching factor.
In most learning situations, an agent must decide whether to choose what it thinks is the best action, or explore a different action in the hopes of finding a better solution.
Our Tetris example shows a straightforward reinforcement learning solution to a well known and reasonably sized problem where there isn't a much more obvious and better solution that should be used instead.
www.melax.com /tetris.html   (1278 words)

  
 Reinforcement Learning in Real-Time Strategy Games Using Case-based Reasoning — AiGameDev.com
Reinforcement learning (RL) is a set of algorithms for solving problems using positive or negative feedback from the environment.
The idea of transfer learning, or learning on a set of unseen but related tasks, would seem appropriate for use in game AI in general (let alone RTS games), particularly when the problem of learning incorrect actions is tempered using reinforcement learning-based utility values (as outlined in the paper).
However, using rewards for the reinforcement learning algorithm based on selected (game dependent) state features may prove to be a limitation of the current implementation of this approach, as only features corresponding to the game agent and a single opponent are used.
aigamedev.com /theory/transfer-learning-rts   (1180 words)

  
 Glossary of Terminology in Reinforcement Learning
Note that not every reinforcement learning agent uses a model of its environment.
Monte Carlo methods - A class of methods for learning of value functions, which estimates the value of a state by running many trials starting at that state, then averages the total rewards received on those trials.
The key distinguishing feature of RL methods is that they learn policies indirectly, by instead learning value functions.
web.cps.msu.edu /rlr/terms.html   (723 words)
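
The Monte Carlo definition in the entry above (run many trials from a state, then average the total rewards) can be sketched as follows. The toy environment is an assumption invented for the example: a random walk on positions 0 to 4 that pays 1 for reaching position 4 and 0 for reaching position 0. The names `mc_value` and `random_walk_trial` are hypothetical.

```python
import random


def mc_value(start_state, run_trial, n_trials):
    """Estimate a state's value as the average total reward over n_trials."""
    total = 0.0
    for _ in range(n_trials):
        total += run_trial(start_state)
    return total / n_trials


def random_walk_trial(state, rng=random):
    """Toy episode (an assumption): step left or right until an end is reached."""
    while 0 < state < 4:
        state += rng.choice((-1, 1))
    return 1.0 if state == 4 else 0.0


if __name__ == "__main__":
    # From the middle position the estimate should hover around 0.5.
    print(mc_value(2, random_walk_trial, n_trials=10_000))
```

No model of the environment is needed here: the estimate comes entirely from sampled episodes.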

Copyright © 2005-2007 www.factbites.com Usage implies agreement with terms.