- https://owlite.ai
- OwLite is a low-code AI model compression toolkit.
- It visualizes computational graphs, identifies bottlenecks, and optimizes latency and memory usage.
- Also includes an auto-optimization feature and a device farm management system for evaluating optimized models.
- You can visualize AI models using OwLite's editor.
- The GUI lets you understand the structure of the entire model at a glance while also giving you detailed information about individual nodes.
- SqueezeBits' engineers provide recommended quantization settings tailored to your model, drawing on their extensive experience with quantization.
- This allows you to obtain a lightweight model while minimizing accuracy drop.
- Based on the visualized model, you can apply quantization to each node directly.
- This gives you fine-grained control over the balance between performance and optimization.
- You can run latency benchmarks within OwLite, making it easy to compare your original model with the models you have edited and decide which result to download.
Install via pip (recommended):
pip install owlite --extra-index-url https://pypi.squeezebits.com/
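After installation, you can verify that the package is importable. This is only a quick sanity check; it assumes owlite exposes a standard `__version__` attribute, which is not stated above, so a bare import is also sufficient.

```python
# Quick post-install sanity check.
# Assumes the owlite package exposes __version__ (an assumption); a bare import also works.
import owlite

print(owlite.__version__)
```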
Please check the [OwLite Documentation] for the user guide and troubleshooting examples.
Explore [OwLite Examples], a repository showcasing seamless compression of PyTorch models into TensorRT engines. It shows how to integrate OwLite with minimal code changes and the compression results you can achieve.
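For orientation, here is a minimal sketch of what such an integration can look like. The API names used below (owlite.init, owl.convert, owlite.calibrate, owl.export, owl.benchmark), their signatures, and the project/experiment names are assumptions for illustration only; confirm the actual API in the [OwLite Documentation] and [OwLite Examples].

```python
# Minimal integration sketch -- all owlite calls below are assumptions, not confirmed by this README.
import torch
import torch.nn as nn

import owlite

# A toy model standing in for your own torch.nn.Module.
model = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU())
dummy_input = torch.randn(1, 3, 224, 224)

# Connect to a project/baseline/experiment on the OwLite platform (assumed signature).
owl = owlite.init(project="demo_project", baseline="demo_baseline", experiment="demo_experiment")

# Trace the model so its graph can be visualized and edited in the OwLite GUI.
model = owl.convert(model, dummy_input)

# Calibrate quantization parameters on representative data (assumed context-manager API).
with owlite.calibrate(model) as calibrated_model:
    for _ in range(8):
        calibrated_model(torch.randn(1, 3, 224, 224))

owl.export(model)  # export the compressed model (e.g., as a TensorRT engine)
owl.benchmark()    # run a latency benchmark on the connected device farm
```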
Please contact [email protected] for any questions or suggestions.