Yuhan's blog

Coder.

Sunday, January 05, 2025

MPI - Quick Tutorial

MPI lets a C program run on multiple processors, and the processors can be on multiple machines. It will SSH into the remote machines...

Sunday, December 08, 2024

Short Explanation of KL Divergence

What is KL divergence? It measures how different 2 probability distributions are. To be more precise - given 2 distributions P and Q, and let P...

Short Explanation of Variational Autoencoder (VAE) and Controlled VAE

A Variational Autoencoder is an autoencoder with a twist. An autoencoder is a network that takes in a large input (ex: an image), encodes it ...

Monday, November 11, 2024

Study CUDA Programming - Summary

Install the NVIDIA driver. Be able to run the nvcc compiler. The name of the file matters - the file extension needs to be ".cu" to be co...

Tuesday, September 10, 2024

Key ideas in the book of: "Reinforcement Learning - An introduction" of Sutton & Barto (Part 3)

Planning: In the previous chapter of the book, Monte Carlo and Temporal Difference were introduced, where real experience is used to learn a ...

Sunday, August 18, 2024

Key ideas in the book of: "Reinforcement Learning - An introduction" of Sutton & Barto (Part 2)

Temporal Difference vs. Monte Carlo method: In the Monte Carlo method, a trajectory (episode) has to end with a terminal state. The final return ...

Monday, August 12, 2024

Importance Sampling: intuitive explanation

We'd like to use the Monte Carlo method to estimate an integral by randomly sampling from a distribution and taking the average of the samples. But for la...

Saturday, July 06, 2024

Key ideas in the book of: "Reinforcement Learning - An introduction" of Sutton & Barto (Part 1)

This is a summary of the key ideas from Sutton and Barto's book on Reinforcement Learning. (Here is the book http://www.incompleteideas...

About Me

Yuhan
