Ming Yin's Blog

A place to explain and share my ideas and thoughts

GRPO From Scratch

This post explains my PyTorch implementation of the Group Relative Policy Optimization (GRPO) algorithm.

11 min read · August 19, 2025

2025
DPO From Scratch

This post explains my PyTorch implementation of the Direct Preference Optimization (DPO) algorithm.

7 min read · August 18, 2025

2025
On Reinforcement Learning for Large Language Models

My personal thoughts on why reinforcement learning is vital for Large Language Models. [Updated 02/21]

5 min read · January 7, 2025

2025
Flow models for Generative AI

As an alternative to Diffusion Models, Continuous Normalizing Flow Matching is one of the most powerful paradigms for generative AI modeling.

11 min read · September 1, 2024

2024
a post with redirect

you can also redirect to assets like pdf

1 min read · July 4, 2021

2021