-
a post with diagrams
an example of a blog post with diagrams
-
Optimal offline RL with the unified model-based framework
A model-based framework combined with the singleton absorbing MDP technique achieves the optimal rate for several challenging offline tasks.
-
a distill-style blog post
an example of a distill-style blog post and main elements
-
A Brief Summary of Upper Bounds for Bandit Problems
This post summarizes the regret analysis of the Exploration-First Algorithm and the Upper Confidence Bound (UCB) Algorithm for the multi-armed bandit (MAB) problem, and of the LinUCB Algorithm for linear bandits.
-
A Brief Introduction to the Influence Function Technique
The influence function technique is powerful because it provides a way to calculate the efficiency bound for semiparametric estimation problems.