DRL Lecture 7 – Sparse Reward – notes – Hung-yi Lee
Sparse rewards in deep reinforcement learning
There are three directions for solving sparse-reward problems:
1. Reward Shaping
The environment has its own true reward, but we design additional rewards by hand to guide the agent.
Ex: for a child:
Take “Play”: $r_{t+1} = 1$, $r_{t+100} = -100$
Take “Study”: $r_{t+1} = -1$, $r_{t+100} = 100$
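The child example can be checked numerically. The sketch below is only an illustration, not something from the lecture: the discount factor of 0.9, the shaping bonus of 3.0, and the helper names `discounted_return` and `preferred_action` are all assumptions chosen for the demonstration. It compares the discounted return of each action for a short-sighted agent, with and without an extra hand-designed reward attached to "Study".

```python
# Rewards from the child example, as (steps ahead, reward) pairs per action.
REWARDS = {
    "Play": [(1, 1.0), (100, -100.0)],
    "Study": [(1, -1.0), (100, 100.0)],
}


def discounted_return(events, gamma):
    """Sum gamma**(k-1) * r over rewards r that arrive k steps ahead."""
    return sum(gamma ** (k - 1) * r for k, r in events)


def preferred_action(rewards, gamma):
    """Return the action with the highest discounted return."""
    return max(rewards, key=lambda a: discounted_return(rewards[a], gamma))


# Assumption: a short-sighted agent, modelled with a discount factor of 0.9,
# so the reward 100 steps ahead is almost invisible to it.
MYOPIC_GAMMA = 0.9

# Assumption: a hypothetical shaping bonus added to the immediate reward of
# "Study" (for example, a treat for studying). The value 3.0 is arbitrary.
SHAPING_BONUS = 3.0
shaped = dict(REWARDS)
shaped["Study"] = [(1, -1.0 + SHAPING_BONUS), (100, 100.0)]

print(preferred_action(REWARDS, MYOPIC_GAMMA))  # environment reward only
print(preferred_action(shaped, MYOPIC_GAMMA))   # with the shaping bonus
print(preferred_action(REWARDS, 1.0))           # far-sighted, no shaping
```

Under these assumptions, the short-sighted agent prefers "Play" on the environment reward alone and switches to "Study" once the bonus is added, which matches what a far-sighted agent (discount factor 1.0) would choose without any shaping.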