Reinforcement learning by reward-weighted regression for operational space control

Reference

Reinforcement learning by reward-weighted regression for operational space control, Jan Peters, Stefan Schaal. Proceedings of the 24th International Conference on Machine Learning (2007)

Abstract

Many robot control problems of practical importance, including operational space control, can be reformulated as immediate-reward reinforcement learning problems. However, few of the known optimization or reinforcement learning algorithms can be used in online learning control for robots, as they are either prohibitively slow, do not scale to interesting domains of complex robots, or require trying out policies generated by random search, which is infeasible for a physical system. Using a generalization of the EM-based reinforcement learning framework suggested by Dayan & Hinton, we reduce the problem of learning with immediate rewards to a reward-weighted regression problem with an adaptive, integrated reward transformation for faster convergence. The resulting algorithm is efficient, learns smoothly without dangerous jumps in solution space, and works well in applications of complex high degree-of-freedom robots.
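
The central idea of the abstract, reducing immediate-reward learning to a weighted regression, can be illustrated with a small sketch. This is not the authors' implementation. It assumes a linear policy with fixed Gaussian exploration noise, a toy quadratic reward, and a fixed exponential reward transformation with a hypothetical parameter `beta`, whereas the paper adapts its reward transformation during learning. All names (`reward_weighted_regression`, `run_demo`, `W_true`) are made up for illustration.

```python
import numpy as np


def reward_weighted_regression(states, actions, rewards, beta=1.0, ridge=1e-6):
    """One weighted least-squares fit of a linear policy, actions ~ states @ W.

    Assumption: rewards become positive weights through a fixed exponential
    transformation; the paper adapts this transformation instead.
    """
    # Subtracting the maximum only rescales the weights (numerical stability).
    weights = np.exp(beta * (rewards - rewards.max()))
    weighted_states = states * weights[:, None]
    gram = weighted_states.T @ states + ridge * np.eye(states.shape[1])
    cross = weighted_states.T @ actions
    return np.linalg.solve(gram, cross)


def run_demo(iterations=30, samples=500, sigma=0.5, seed=0):
    """Toy problem: the reward is highest when the action equals states @ W_true."""
    rng = np.random.default_rng(seed)
    W_true = rng.normal(size=(3, 2))   # hypothetical target mapping
    W = np.zeros((3, 2))               # initial policy parameters
    initial_err = np.linalg.norm(W - W_true)
    for _ in range(iterations):
        states = rng.normal(size=(samples, 3))
        # Explore around the current policy mean with Gaussian noise.
        actions = states @ W + sigma * rng.normal(size=(samples, 2))
        # Immediate reward: negative squared distance to the target action.
        rewards = -np.sum((actions - states @ W_true) ** 2, axis=1)
        W = reward_weighted_regression(states, actions, rewards)
    return initial_err, np.linalg.norm(W - W_true)


initial_err, final_err = run_demo()
print(f"parameter error: {initial_err:.3f} -> {final_err:.3f}")
```

Each update only re-fits the policy to actions it has already sampled, weighted by how well they scored, so the parameters move gradually toward better regions, which is consistent with the smooth learning behaviour the abstract describes.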

Content citing this item

Advantage-Induced Policy Alignment

Building on the classic results on reward weighted regression and its more recent adaptation to deep learning, a new algorithm called …

Reinforcement Learning

Jul 14, 2023
