v0.1.1
subwiz was crashing with out-of-memory errors when a large number of predictions was requested. we now run inference in batches of 500 sequences at a time (see the sketch after the list below). in the future we could:
- make batch size configurable
- set batch size automatically based on available memory
- use quantization to decrease memory usage
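
the shape of the fix is roughly like the minimal sketch below. names like `model` and `sequences` are hypothetical stand-ins, not subwiz's actual internals:

```python
import torch

BATCH_SIZE = 500  # fixed for now; a future release could make this configurable


def predict_in_batches(model: torch.nn.Module, sequences: torch.Tensor) -> torch.Tensor:
    """Run inference in fixed-size batches so only BATCH_SIZE
    sequences occupy memory at once, instead of the full input."""
    outputs = []
    with torch.no_grad():  # inference only, so skip gradient bookkeeping
        for start in range(0, len(sequences), BATCH_SIZE):
            batch = sequences[start : start + BATCH_SIZE]
            outputs.append(model(batch))
    return torch.cat(outputs)
```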