Demonstrates speculative sampling using the BLOOM 560m and 7b1 models.
Supports KV cache optimization.
Only works with a batch size of 1.
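
For readers unfamiliar with the technique, the accept/reject rule at the core of speculative sampling can be sketched as follows. This is a minimal illustration and not this project's implementation: the function name `speculative_step` and its arguments are hypothetical, the probability tables stand in for real model outputs, and it assumes the smaller model (here presumably the 560m one) drafts tokens while the larger model (7b1) verifies them. It handles a single sequence, matching the batch size 1 restriction, and omits the KV cache entirely.

```python
import random


def speculative_step(draft_probs, target_probs, draft_tokens, rng):
    """Verify drafted tokens against the target model (hypothetical helper).

    draft_probs[i]  : draft model distribution over the vocabulary at position i
    target_probs[i] : target model distribution at position i; it holds one
                      extra trailing entry used when every draft is accepted
    draft_tokens[i] : token the draft model sampled at position i
    """
    accepted = []
    for i, tok in enumerate(draft_tokens):
        q = draft_probs[i][tok]   # > 0, because tok was sampled from the draft
        p = target_probs[i][tok]
        # Accept the drafted token with probability min(1, p / q).
        if rng.random() < min(1.0, p / q):
            accepted.append(tok)
            continue
        # Rejected: resample from the residual max(0, p - q). Its weights are
        # not all zero, since p < q for this token implies p > q elsewhere.
        residual = [max(0.0, pt - qt)
                    for pt, qt in zip(target_probs[i], draft_probs[i])]
        accepted.append(rng.choices(range(len(residual)), weights=residual)[0])
        return accepted           # discard the remaining drafted tokens
    # Every draft was accepted: sample one bonus token from the target.
    last = target_probs[-1]
    accepted.append(rng.choices(range(len(last)), weights=last)[0])
    return accepted
```

Each call therefore yields between one and `len(draft_tokens) + 1` tokens for a single pass of the larger model, which is where the speedup comes from.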