Pricing Tiers
Pay-as-you-go
For individual developers and hobbyists
No Platform Fee
$50 free credits at signup
Platform, Web UI, and SDK
All pre-trained models
Team
Everything in Pay-as-you-go and…
24-hour email support SLA
Enterprise
For organizations with ongoing, large-scale needs
Custom Platform Fee
Pay for compute time directly
Everything in Team and…
Custom support packages
Custom Integrations
Available Models
We select frontier open-source models that perform well on typical batch inference tasks. If you need help finding the right model for your task, please reach out to team@sutro.sh and we would be happy to help.
Text and Vision Models
Model ID
Context Window
131,072
131,072
131,072
262,144
32,768
32,768
262,144
131,072
131,072
131,072
131,072
131,072
131,072
131,072
131,072
Reasoning Models
Model ID
Context Window
262,144
32,768
32,768
262,144
131,072
Embedding Models
Model ID
Context Window
Custom Models
We also offer support for custom and fine-tuned models on a per-request basis. To discuss such needs, please reach out at team@sutro.sh.
Notes
We serve quantized versions of some of the models we offer, which lets us pass further time and cost savings on to users. If you have a workload that could benefit from full-precision inference, we'd like to learn more; please reach out to team@sutro.sh.
Average token prices are based on blended input and output costs, weighted according to representative batch inference workload shapes. Actual pricing will depend on total usage. We encourage users to estimate costs ahead of job submission using the dry run functionality described in the documentation. For questions on pricing, please reach out to team@sutro.sh.
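As a rough illustration of what a blended token price means, the sketch below computes a weighted average of input and output prices and applies it to a job's total token count. The function names, the prices, and the 75/25 input/output split are hypothetical values chosen for the example; they are not actual Sutro prices or workload weights, and this is not the platform's dry run functionality.

```python
def blended_price_per_million(input_price, output_price, input_share):
    """Weighted average of input and output prices (USD per 1M tokens).

    input_share is the fraction of all tokens that are input tokens.
    """
    if not 0.0 <= input_share <= 1.0:
        raise ValueError("input_share must be between 0 and 1")
    return input_price * input_share + output_price * (1.0 - input_share)


def estimate_job_cost(total_tokens, blended_price):
    """Rough job cost in USD for a given total token count."""
    return total_tokens / 1_000_000 * blended_price


# Hypothetical prices and workload shape, for illustration only.
blended = blended_price_per_million(
    input_price=0.10, output_price=0.40, input_share=0.75
)
cost = estimate_job_cost(total_tokens=20_000_000, blended_price=blended)
print(f"blended: ${blended:.4f} per 1M tokens, estimated job cost: ${cost:.2f}")
```

A workload that produces more output tokens than the assumed split will land above the blended figure, which is why a dry run on the real job gives a better estimate than this back-of-the-envelope calculation.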