This repo contains all of the code used in the entire development process for Seahorse, a VLLM based on Phi3.5 and CLIP.
Features | Experiments | Evaluation | Project Structure
- Built on Phi3.5 and CLIP, supports arbitrary interleaved images and text
- Optimized for training on a single GeForce RTX 4090
- Easily extensible for new research experiments, supports optuna
- Comprehensive, standardized evaluation using lmms-eval
This project uses a Makefile to manage tasks. Tasks in the Makefile rely on uv for dependency management.
To run an experiment, use the run-experiment task. For example:
make run-experiment pretrain
Tip: This is equivalent to running:
uv run python seahorse/experiments/run_experiment.py pretrain
If you prefer not to use uv, you can manually install the project dependencies and then run:
python seahorse/experiments/run_experiment.py pretrain
This looks up the function pretrain() in the experiment registry and executes it to create one or more experiment configurations. A training run is then launched for each of those configurations.
Evaluation is performed via the lmms-eval library.
seahorse/
├── seahorse/ # Main Python package for the project
│ ├── config/ # HF-style configuration files for SeahorseModel
│ ├── data/ # Data preprocessing and loading (e.g. datasets, collators, etc.)
│ ├── eval/ # Evaluation code and utilities (e.g. benchmark scoring, etc.)
│ ├── experiments/ # Experiment configuration and launching
│ ├── models/ # Model architectures and construction
│ ├── train/ # Training script and custom HF Trainer class
│ └── utils/ # Misc utility tools (rng, profiling, etc.)
├── tests/ # Sanity-preserving unit tests for the project
└── Makefile # Simple task management (`run-experiment`, `test`, etc.)
To run the unit tests for the project, use the test task:
make test
To run a specific test, run:
make test TEST_ARGS="-k test_seahorse_tokenizer"