Training Flow Matching Models with Reliable Labels via Self-Purification

This repository contains a minimal example demonstrating Self-Purifying Flow Matching (SPFM) in an IPython/Jupyter notebook.

Overview

Self-Purifying Flow Matching (SPFM) is a principled approach to filtering unreliable data within the flow-matching framework. SPFM identifies suspicious data using the model itself during the training process, bypassing the need for pretrained models or additional modules. This is particularly useful when a training dataset contains noisy or incorrect labels.
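As a rough illustration of what "using the model itself" could mean, the sketch below computes a per-sample flow-matching loss with and without the label and flags a sample when conditioning on its label does not help. The selection rule (conditional loss greater than unconditional loss), the linear interpolation path, and the names `per_sample_fm_loss`, `spfm_select`, and `toy_velocity` are assumptions made for this example and are not taken from this repository; `toy_example.ipynb` remains the reference for the actual method. A hand-written velocity function stands in for a trained network so the snippet runs with NumPy alone.

```python
import numpy as np

def per_sample_fm_loss(velocity_fn, x0, x1, t, labels):
    """Per-sample flow-matching loss on a straight path from x0 to x1.

    velocity_fn(x_t, t, labels) returns a (B, D) velocity; t is passed with
    shape (B, 1). labels=None requests the unconditional prediction.
    """
    t = t[:, None]                    # (B, 1) so it broadcasts over features
    x_t = (1.0 - t) * x0 + t * x1     # point on the linear path at time t
    target = x1 - x0                  # constant velocity of that path
    pred = velocity_fn(x_t, t, labels)
    return np.sum((pred - target) ** 2, axis=1)

def spfm_select(loss_cond, loss_uncond):
    """Assumed rule: a label is suspicious if the conditional loss is larger
    than the unconditional loss for the same sample."""
    suspicious = loss_cond > loss_uncond
    # Suspicious samples contribute their unconditional loss (label dropped).
    per_sample = np.where(suspicious, loss_uncond, loss_cond)
    return suspicious, per_sample.mean()

# Hypothetical stand-in for a trained model: it pushes x_t toward the class
# center (conditional) or toward the midpoint of the centers (unconditional).
centers = np.array([[-2.0, 0.0], [2.0, 0.0]])

def toy_velocity(x_t, t, labels):
    goal = centers.mean(axis=0) if labels is None else centers[labels]
    return (goal - x_t) / (1.0 - t)

rng = np.random.default_rng(0)
true_y = np.array([0, 0, 1, 1])
given_y = np.array([0, 1, 1, 0])      # samples 1 and 3 carry wrong labels
x1 = centers[true_y] + 0.1 * rng.standard_normal((4, 2))
x0 = rng.standard_normal((4, 2))      # noise samples at t = 0
t = rng.uniform(0.0, 0.9, size=4)     # stay away from t = 1 (division above)

loss_cond = per_sample_fm_loss(toy_velocity, x0, x1, t, given_y)
loss_uncond = per_sample_fm_loss(toy_velocity, x0, x1, t, None)
suspicious, loss = spfm_select(loss_cond, loss_uncond)
print(suspicious.tolist())            # [False, True, False, True]
```

In a real training loop the two losses would come from the network being trained, evaluated on the same `x_t` and `t`, so the filter needs no extra model.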

Getting Started

Prerequisites

```
# Install required packages
pip install torch numpy matplotlib jupyter
```

Quick Start

1. Clone this repository
2. Open the Jupyter notebook:
   ```
   jupyter notebook toy_example.ipynb
   ```
3. Run the cells to see SPFM in action on a toy dataset

Example Usage

See toy_example.ipynb for a complete minimal example demonstrating:

- Dataset preparation with synthetic noise
- SPFM training loop implementation
- Comparison with standard flow matching
- Visualization of results
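For readers who want a feel for the first item before opening the notebook, here is a minimal sketch of preparing a toy dataset with synthetic label noise. The two-Gaussian layout, the 20% flip rate, and the helper name `make_noisy_toy_dataset` are illustrative assumptions; the notebook may build its data differently.

```python
import numpy as np

def make_noisy_toy_dataset(n_samples=1000, noise_rate=0.2, seed=0):
    """Two 2-D Gaussian blobs with a fixed fraction of labels flipped."""
    rng = np.random.default_rng(seed)
    centers = np.array([[-2.0, 0.0], [2.0, 0.0]])

    # Clean labels and the points drawn around the matching center.
    clean_labels = rng.integers(0, 2, size=n_samples)
    x = centers[clean_labels] + 0.3 * rng.standard_normal((n_samples, 2))

    # Flip exactly round(noise_rate * n_samples) labels, chosen at random
    # without replacement, to simulate mislabeled samples.
    n_flip = int(round(noise_rate * n_samples))
    flip_idx = rng.choice(n_samples, size=n_flip, replace=False)
    noisy_labels = clean_labels.copy()
    noisy_labels[flip_idx] = 1 - noisy_labels[flip_idx]
    return x, clean_labels, noisy_labels

x, clean_labels, noisy_labels = make_noisy_toy_dataset()
print(x.shape, int((clean_labels != noisy_labels).sum()))
```

Keeping `clean_labels` alongside `noisy_labels` makes it possible to check afterwards how many of the flipped samples a filtering method actually flags.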

Citation

If you use this code in your research, please cite:

@article{kim2025training,
  title={Training Flow Matching Models with Reliable Labels via Self-Purification},
  author={Kim, Hyeongju and Yu, Yechan and Yi, June Young and Lee, Juheon},
  journal={arXiv preprint arXiv:2509.19091},
  year={2025}
}

Maintainers

Hyeongju Kim - hyeongju@supertone.ai
Yechan Yu - ato@supertone.ai

Contributing

For questions, suggestions, or contributions, please contact the maintainers or open an issue on this repository.
