A PyTorch implementation of Bayesian Flow Networks
Currently being used for a non-causal version of LLAMA2.
I am going to use this repository to explore the training dynamics of this new class of models. A minimal BFN implementation is maintained in Minimal.ipynb; everything else is an early work in progress.
- Discrete model with continuous-time loss, training and sampling (completed; see the loss sketch after this list)
- SOTA performance on XOR dataset
- Tiny Stories 15m LLAMA2 Initial Code
- Tiny Stories weights (training in progress)
- Wiki Text8 Dataset
- Bayesian Flow GPT-2 Scale
- Fancy Visuals
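
For reference, here is a minimal sketch of the discrete-data, continuous-time BFN loss mentioned above, following the formulation in Graves et al. (2023). The `model(theta, t)` signature, the `beta_1` value, and the function name are illustrative assumptions, not the interface used elsewhere in this repo.

```python
import torch
import torch.nn.functional as F

def discrete_bfn_cts_loss(model, x, K, beta_1=3.0):
    """Continuous-time BFN loss for discrete data (sketch).

    model:  hypothetical network mapping (theta, t) -> logits of shape (B, D, K)
    x:      integer class indices, shape (B, D)
    K:      number of classes
    beta_1: accuracy-schedule endpoint beta(1) (hyperparameter, assumed value)
    """
    B, D = x.shape
    # Sample t ~ U(0, 1) per example; beta(t) = beta(1) * t^2 for discrete data
    t = torch.rand(B, 1, 1, device=x.device)
    beta_t = beta_1 * t ** 2

    e_x = F.one_hot(x, K).float()                 # one-hot targets, (B, D, K)

    # Bayesian flow distribution: y ~ N(beta(t) * (K * e_x - 1), beta(t) * K * I)
    mean = beta_t * (K * e_x - 1.0)
    std = (beta_t * K).sqrt()
    y = mean + std * torch.randn_like(e_x)
    theta = F.softmax(y, dim=-1)                  # input distribution parameters

    # Network predicts the output distribution e_hat(theta, t)
    logits = model(theta, t.squeeze(-1))          # (B, D, K)
    e_hat = F.softmax(logits, dim=-1)

    # L_infinity = K * beta(1) * E_t[ t * || e_x - e_hat ||^2 ]
    loss = K * beta_1 * (t * (e_x - e_hat) ** 2).sum(-1)
    return loss.mean()
```

Training then reduces to calling this loss inside a standard optimizer loop; sampling uses the separate discrete-time update rule from the paper and is not shown here.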