Learn, Build, Deploy
Baseten supports billions of custom, fine-tuned LLM calls per week from OpenEvidence, serving high-stakes medical information to healthcare providers in every major healthcare facility in the country. If you see a doctor today, chances are they are leveraging OpenEvidence for trustworthy, up-to-date medical information at their fingertips. Baseten's tireless dedication to reliability and deep support at scale have proven equal to a mission that is, at times, literally life-or-death.
By fine-tuning Qwen models on Baseten Training, we exceeded the intelligence of closed-source models, while cutting overall inference costs by 60%. This also delivered a dramatic speedup, reducing p90 latency from 2.2 seconds to just 250 milliseconds—opening the door to entirely new, latency-sensitive LLM use cases.
Blog
- Introducing Baseten Loops
- DFlash: 3x faster LLM inference