CS Learning

MathFoundationRL

Introduction

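A course from Westlake University. The material is accessible and still works through fairly detailed derivations, so in my opinion it is well suited as an introductory course on reinforcement learning.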

Outline

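Definition of the MDP
Bellman equation
Dynamic programming (DP), which requires a complete model (value iteration and policy iteration)
Monte Carlo (MC) methods, which do not require a complete model
Introduction of TD learning, Sarsa, and Q-learning
Function approximation (e.g., DQN), for when the state space is too large to represent with a table
Finally, policy gradient methods, which optimize the policy directly, and Actor-Critic methods, which combine value functions with policy optimization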
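
To make the step from model-based to model-free methods concrete, here is a minimal sketch of value iteration next to a single tabular Q-learning update. It is not taken from the course: the two-state MDP, its numbers, and the names `P`, `R`, `value_iteration`, and `q_learning_update` are all assumptions made up for illustration.

```python
import numpy as np

# Hypothetical 2-state, 2-action MDP (illustrative numbers, not from the course).
# P[s, a, s2] is the probability of landing in s2 after taking action a in state s.
# R[s, a] is the expected immediate reward.
P = np.array([
    [[1.0, 0.0], [0.0, 1.0]],  # state 0: action 0 stays, action 1 moves to state 1
    [[1.0, 0.0], [0.0, 1.0]],  # state 1: action 0 moves to state 0, action 1 stays
])
R = np.array([
    [0.0, 0.0],
    [0.0, 1.0],  # only staying in state 1 is rewarded
])
gamma = 0.9


def value_iteration(P, R, gamma, tol=1e-10):
    """Model-based: apply the Bellman optimality update until v stops changing."""
    v = np.zeros(P.shape[0])
    while True:
        # q[s, a] = R[s, a] + gamma * sum over s2 of P[s, a, s2] * v[s2]
        q = R + gamma * (P @ v)
        v_new = q.max(axis=1)
        if np.max(np.abs(v_new - v)) < tol:
            return v_new, q.argmax(axis=1)
        v = v_new


def q_learning_update(Q, s, a, r, s_next, alpha, gamma):
    """Model-free: one tabular Q-learning step from a sampled transition."""
    td_target = r + gamma * Q[s_next].max()
    Q[s, a] += alpha * (td_target - Q[s, a])
    return Q


v_star, pi_star = value_iteration(P, R, gamma)

# One sampled transition: in state 1, take action 1, receive reward 1, stay in state 1.
Q = q_learning_update(np.zeros((2, 2)), s=1, a=1, r=1.0, s_next=1, alpha=0.5, gamma=gamma)
```

Value iteration reads `P` and `R` directly, whereas the Q-learning step only sees one sampled transition `(s, a, r, s_next)`. That difference is what separates the DP item in the outline from the MC and TD items.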

My notes

https://github.com/Tenshi0x0/MathFoundationRL/tree/master/my%20notes/pdfs