116 posts in total. Keep on posting.
Showing posts 97–108 of 116. Each entry opens locally on this site; legacy Hexo posts link back to their original article at the bottom for reference.
2021
- EN
Reinforcement Learning-Principle-Day8
Reinforcement learning study notes — model-based reinforcement learning and planning with learned dynamics.
- EN
Reinforcement Learning-Principle-Day7
Reinforcement learning study notes — advanced policy optimization: PPO, TRPO, and trust region methods.
- EN
Operating System Memory Address
Notes on operating system memory management — covering virtual addressing, page tables, TLB, and memory allocation strategies.
- EN
Reinforcement Learning-Principle-Day6
Reinforcement learning study notes — policy gradient methods: REINFORCE, Actor-Critic, and A2C algorithms.
- EN
Reinforcement Learning-Principle-Day5
Reinforcement learning study notes — function approximation and the DQN (Deep Q-Network) breakthrough.
- EN
MetaLearning-Standford-Lecture4
Stanford CS 330 Meta-Learning lecture notes — exploring metric learning, Siamese networks, and Matching Networks for few-shot classification.
- EN
Reinforcement Learning-Principle-Day4
Reinforcement learning study notes — temporal difference learning: TD(0), SARSA, and Q-learning algorithms.
2020
- EN
Reinforcement Learning-Principle-Day3
Reinforcement learning study notes — Monte Carlo methods for prediction and control in model-free settings.
- EN
Confusion Caused by HHKB's BS and Delete Keys
Debugging notes on common deletion-related confusions in programming and system administration.
- EN
MetaLearning-Standford-Lecture3
Stanford CS 330 Meta-Learning lecture notes — covering optimization-based meta-learning methods including MAML and its variants.
- EN
MetaLearning-Standford-Lecture2
Stanford CS 330 Meta-Learning lecture notes — covering learning-to-learn approaches, few-shot learning, and meta-optimization fundamentals.
- EN
Reinforcement Learning-Principle-Day2
Reinforcement learning study notes — covering dynamic programming methods: policy evaluation, policy iteration, and value iteration.