Mathematics / Mathematical optimization / Dynamic programming / Mathematical analysis / Equations / Operations research / Systems theory / Stochastic control / Bellman equation / Markov decision process / Q-learning / Reinforcement learning
Date: 2015-12-12 00:05:18

Increasing the Action Gap: New Operators for Reinforcement Learning. Marc G. Bellemare, Georg Ostrovski, Arthur Guez, Philip S. Thomas∗, and Rémi Munos. Google DeepMind. {bellemare,ostrovski,aguez,munos}@google.com;

Source URL: psthomas.com

File Size: 694,77 KB

Similar Documents

Journal of Global Optimization manuscript No. (will be inserted by the editor) Stabilizer-based symmetry breaking constraints for mathematical programs Leo Liberti · James Ostrowski

DocID: 1v0h9

OPTIMA 88 Mathematical Optimization Society Newsletter Philippe L. Toint MOS Chair’s Column

DocID: 1uTp0

The Annals of Probability 2004, Vol. 32, No. 1B, 1030–1067 © Institute of Mathematical Statistics, 2004 A STOCHASTIC REPRESENTATION THEOREM WITH APPLICATIONS TO OPTIMIZATION AND OBSTACLE PROBLEMS

DocID: 1sOP9

Optimization of Electrical Production. The production of electricity in France is optimized every day with the help of mathematical software developed at Inria, in collaboration with EDF R&D. Substantial performance is a

DocID: 1rxID

Timed-Elastic-Bands for Time-Optimal Point-To-Point Nonlinear Model Predictive Control

DocID: 1ru4i