CAST AI, the leading Kubernetes automation platform, has launched AI Enabler, an optimization tool that streamlines the deployment of LLMs and drastically reduces operational expenses. AI Enabler automatically identifies the LLM that offers the best performance at the lowest inference cost, then deploys it on CAST AI-optimized Kubernetes clusters.
AI Enabler reviews a wide range of LLMs, both open-source and commercial, to ensure that organisations use the best-suited LLM available. It calculates this based on three key factors: cost-effectiveness, quality, and performance.
Solving the LLM budget challenge
The number of LLMs available is expanding rapidly, making it harder for infrastructure teams to identify the best LLM to use. AIOps, MLOps, and DevOps teams often select an LLM that they feel best fits their scenario without the rigour and diligence needed to ensure the decision is the right one.
What often happens is that teams default to the latest or most familiar LLM. The risk is that costs escalate significantly down the line, and what seemed appropriate in a trial exceeds the budget once it is fully live.
Laurent Gil, co-founder and CPO at CAST AI, said, “With the increasing availability of LLMs, choosing the right one for your use case, and doing so cost-effectively, has become a real challenge.
“AI Enabler removes that complexity by automatically routing queries to the most efficient models and providing detailed cost insights, helping businesses fully leverage AI at a fraction of the cost. This automated approach allows organizations to scale generative AI solutions across their operations.
“Our customers have been asking for a way to harness the power of LLMs without the prohibitive costs of the most popular models. With automated model selection and the ability to launch models locally on spot GPUs, we’ve made large-scale LLM deployment feasible for companies who need real-time insights without the high price tag.”
AI Enabler features
CAST AI identifies four features of AI Enabler:
- It integrates with any OpenAI-compatible API endpoint, analysing the cost of specific users and API keys, overall usage patterns, and other factors.
- AI Enabler works with an organisation’s existing technology stack with no re-coding required.
- AI Enabler identifies the LLMs with the best ratio of cost, performance, and accuracy.
- It deploys the LLM on CAST AI-optimized Kubernetes clusters. In turn, this can unlock even more Generative AI savings through cost-optimization features such as autoscaling and bin packing.
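To illustrate the routing idea behind such a tool, the sketch below picks the cheapest model for a request from a price table and could sit behind the same OpenAI-compatible endpoint that applications already call. The model names and per-token prices are hypothetical, and this is not CAST AI's actual implementation:

```python
# Hypothetical cost-based model router. Model names and prices below are
# illustrative only, not real vendor pricing.
MODEL_COSTS = {
    # model name: (input $ per 1M tokens, output $ per 1M tokens)
    "large-model": (5.00, 15.00),
    "small-model": (0.15, 0.60),
}

def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimated cost in dollars of one request to a given model."""
    in_price, out_price = MODEL_COSTS[model]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000

def route(input_tokens: int, output_tokens: int) -> str:
    """Select the candidate model with the lowest estimated request cost."""
    return min(
        MODEL_COSTS,
        key=lambda m: estimate_cost(m, input_tokens, output_tokens),
    )

# Because the router exposes the same request/response contract as an
# OpenAI-compatible API, callers need no re-coding to benefit from it.
print(route(1000, 500))  # → small-model
```

A production router would also weigh quality and latency per model, not cost alone, which is the three-factor trade-off the article describes.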
AI Enabler also works alongside the CAST AI Playground, a testing resource that compares LLM performance and cost. Users can benchmark and customise configurations for optimal LLM selection without any code adjustments.
AI Enabler is available as a free trial, or organisations can book a live demo to see the solution.
Enterprise Times: What does this mean?
With AI Enabler, CAST AI has made it much easier for organisations to identify the best LLM for each use case. As organisations run more and more Kubernetes clusters in production, there are growing concerns about cost overruns.
The number of different LLMs available and their often complex pricing mechanisms mean it can be very hard to determine the most cost-effective option. Even then, the lowest-priced LLM is not always the best, or even the cheapest, in the long run. With AI Enabler, CAST AI provides a tool to help teams avoid bill shock in their AI rollouts.