Incremental Truncated LSTD

Gehring, Clement; Pan, Yangchen; White, Martha

arXiv:1511.08495 (cs)

[Submitted on 26 Nov 2015 (v1), last revised 18 Nov 2016 (this version, v3)]

Abstract: Balancing between computational efficiency and sample efficiency is an important goal in reinforcement learning. Temporal difference (TD) learning algorithms stochastically update the value function, with a linear time complexity in the number of features, whereas least-squares temporal difference (LSTD) algorithms are sample efficient but can be quadratic in the number of features. In this work, we develop an efficient incremental low-rank LSTD(λ) algorithm that progresses towards the goal of better balancing computation and sample efficiency. The algorithm reduces the computation and storage complexity to the number of features times the chosen rank parameter while summarizing past samples efficiently to nearly obtain the sample complexity of LSTD. We derive a simulation bound on the solution given by truncated low-rank approximation, illustrating a bias-variance trade-off dependent on the choice of rank. We demonstrate that the algorithm effectively balances computational complexity and sample efficiency for policy evaluation in a benchmark task and a high-dimensional energy allocation domain.
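
The cost contrast described in the abstract can be illustrated with a minimal NumPy sketch. This is not the paper's algorithm: it shows a textbook TD(0) step (linear in the number of features d), textbook LSTD(0) statistics (quadratic in d), and a batch truncated-SVD solve standing in for the low-rank idea. The batch SVD here costs O(d^3) and is only illustrative; the incremental O(d x rank) update is what the paper contributes and is not reproduced. All function names, the step size `alpha`, and the discount `gamma` are assumptions made for this example.

```python
import numpy as np


def td0_update(w, x, r, x_next, gamma=0.9, alpha=0.1):
    """One TD(0) step with linear features: O(d) time and memory."""
    delta = r + gamma * (x_next @ w) - (x @ w)  # temporal-difference error
    return w + alpha * delta * x


def lstd_accumulate(A, b, x, r, x_next, gamma=0.9):
    """Add one transition to the LSTD(0) statistics: O(d^2) time and memory."""
    A = A + np.outer(x, x - gamma * x_next)
    b = b + r * x
    return A, b


def truncated_solve(A, b, rank):
    """Approximate the solution of A w = b from the top-`rank` singular triplets.

    Illustrative batch version; assumes the kept singular values are nonzero.
    """
    U, s, Vt = np.linalg.svd(A)
    U, s, Vt = U[:, :rank], s[:rank], Vt[:rank, :]
    return Vt.T @ ((U.T @ b) / s)


def demo(d=4, n=200, seed=0):
    """Accumulate random transitions and solve at full and reduced rank."""
    rng = np.random.default_rng(seed)
    A, b = np.zeros((d, d)), np.zeros(d)
    for _ in range(n):
        x, x_next = rng.standard_normal(d), rng.standard_normal(d)
        A, b = lstd_accumulate(A, b, x, rng.standard_normal(), x_next)
    return A, b, truncated_solve(A, b, d), truncated_solve(A, b, 2)
```

With `rank` equal to d and an invertible `A`, `truncated_solve` coincides with the ordinary LSTD solution; a smaller `rank` discards the directions with the smallest singular values, which is the source of the bias-variance trade-off the abstract mentions.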

Comments:	Accepted to IJCAI 2016
Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
Cite as:	arXiv:1511.08495 [cs.LG]
	(or arXiv:1511.08495v3 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.1511.08495

Submission history

From: Yangchen Pan
[v1] Thu, 26 Nov 2015 20:37:09 UTC (103 KB)
[v2] Wed, 3 Feb 2016 18:40:20 UTC (497 KB)
[v3] Fri, 18 Nov 2016 05:58:06 UTC (511 KB)
