
Baseten Hybrid: control and flexibility in your cloud and ours

Get the performance of a managed service in your own VPC, with seamless overflow to Baseten Cloud.

Trusted by top engineering and machine learning teams

High-performance inference with seamless overflow

Flex your cloud

Maintain SLAs during traffic spikes, avoid vendor lock-in, and leverage existing cloud credits with our effortless multi-cloud routing.

Cut latency

With rapid cold starts and tailored model performance, our customers achieve lower overall latency and faster time to first token.

Designed for compliance

Keep sensitive workloads in your VPC, and lean on the SOC 2 Type II, HIPAA, and GDPR compliance of Baseten Cloud.

Choosing Baseten Hybrid, Self-hosted, or Cloud

| Feature | Baseten Hybrid | Baseten Self-hosted | Baseten Cloud |
| --- | --- | --- | --- |
| Data control | Full data control in your VPC; managed data security on Baseten Cloud | Full data control | Managed data security; we never store model inputs or outputs |
| Data residency requirements | Region-locked data and deployments with multi-region support | Region-locked data and deployments | Multi-region support with global deployment options |
| Compute capacity | Leverage existing resources or Baseten compute for overflow | Leverage existing in-house resources | Leverage on-demand compute with SOTA GPUs |
| Cost efficiency | Use in-house compute whenever available for optimized costs | Utilize dedicated resources without extra spend on hardware | Gain cost-effective, on-demand compute |
| Integration with internal systems | Custom or out-of-the-box integrations | Custom or out-of-the-box integrations | Easy integration via Baseten's ecosystem |
| Performance optimization | SOTA on-chip model performance and low network latency | SOTA on-chip model performance and low network latency | SOTA on-chip model performance and low network latency |
| Scalability | High, tailored scalability with flex capacity on Baseten Cloud | High, tailored scalability | High, flexible scaling options |
| Security and compliance | Adhere to custom policies and our SOC 2 Type II, HIPAA, and GDPR compliance | Adhere to custom organizational policies | SOC 2 Type II certified, HIPAA compliant, and GDPR compliant by default |
| Support and maintenance | Comprehensive support and managed services | Comprehensive support and managed services | Comprehensive support and managed services |
| Utilization of existing cloud commits | Use credits or commits | Use credits or commits | Spend down existing cloud commits |

Get the best of Self-hosted and Cloud deployments

Flex on-demand

Use internal resources whenever they’re available, and transition seamlessly to Baseten Cloud whenever necessary.

Control data residency

Host in your VPC, or use a dedicated deployment on Baseten Cloud. We never store model inputs or outputs.

Auto-scale to peak demand

Future-proof your product against traffic bursts with our optimized autoscaling and blazing-fast cold starts.

Meet compliance

Store data where you need it, and lean on the SOC 2 Type II, HIPAA, and GDPR compliance of Baseten Cloud.

Optimize costs

Fully use existing hardware or cloud commits, and take advantage of Baseten’s on-demand pricing for overflow.

Ship faster

Save time with out-of-the-box performance optimizations and engineers dedicated to hitting your performance targets.

Key Benefits

Reliable performance at peak demand

Stress-free traffic spikes

Meet SLAs without investing in additional GPUs. With full cloud elasticity, we manage your workloads wherever there's capacity.

Engineered for speed

Along with our out-of-the-box performance optimizations, get white-glove support from our dedicated engineering teams to tune response times at the millisecond level.

Enterprise compliance

You don't need to sacrifice performance for compliance. Keep sensitive workloads on-prem, and rely on our single-tenant, HIPAA- and GDPR-compliant, SOC 2 Type II-certified infra only when needed.

Get started with Baseten Hybrid

Guides and examples

Learn about the Baseten platform

Learn how Baseten infra operates in both Cloud and Self-hosted deployments.

Deploy a custom model

Experience the Baseten UI before customizing your deployments.

Security and compliance

Learn how we ensure security and compliance in Hybrid, Self-hosted, and Cloud deployments.