Together AI has launched a new platform to help enterprises boost their GenAI models. The Together Enterprise Platform looks to manage the entire GenAI lifecycle and optimise it for model performance, GPU utilisation, and cost.

At launch, the company claims the platform delivers 2-3x faster inference and up to 50% lower GPU operational costs on both cloud and on-premises infrastructure. It’s a bold claim and an improvement on the acceleration speeds the company already advertises on its website. It will be interesting to see how many new customers this helps it attract.

The company allows customers to deploy its solutions on their own on-premises infrastructure or on their preferred cloud platform: AWS, Azure, GCP or OCI. It supports a broad set of open-source models, which customers can import into the platform for free before taking advantage of Together AI’s tools.

Using its inference engine across those models, taking advantage of fine-tuning, or using the company’s GPU clusters comes at a price, which is clearly documented on its website.
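As a rough illustration of how that hosted inference is consumed, the sketch below uses Together AI’s published Python SDK to call one of the hosted open-source models. The model identifier and prompt are illustrative assumptions; check Together AI’s model catalogue for current names.

    # pip install together
    import os

    from together import Together

    # The client can read TOGETHER_API_KEY from the environment; passing it
    # explicitly here just makes the dependency visible.
    client = Together(api_key=os.environ["TOGETHER_API_KEY"])

    # Illustrative model name from the hosted Llama 3.1 family; the exact
    # identifier may differ in the current catalogue.
    response = client.chat.completions.create(
        model="meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo",
        messages=[
            {"role": "user", "content": "Summarise the benefits of GPU autoscaling."}
        ],
    )

    print(response.choices[0].message.content)

The SDK mirrors the widely used OpenAI-style chat-completions interface, which is part of why importing an existing workload onto the platform is straightforward.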

What is in the Together Enterprise Platform?

The Together Enterprise Platform builds on the four solutions that the company already offers – Inference, Fine-Tuning, Custom Models and GPU Clusters. It delivers a single platform that gives customers access to all those capabilities and more. The “more” is the integrated nature of the platform and the ease of use it delivers.

In the announcement, Together AI highlighted five key features of the new platform (a sketch of the fine-tuning workflow behind them follows the list). They are:

  • Deploy in any environment, with full control: Deploy on Together Cloud, your Virtual Private Cloud (VPC), or on-premises. Your data remains within your firewall.
  • Continuous model optimization: Implement advanced optimization techniques like auto fine-tuning and adaptive speculators to continuously improve model performance over time.
  • Access 200+ models or bring your own: Choose from leading model families like Llama and Mixtral, or use your own custom models for inference and fine-tuning. We offer a wide range of model types, including chat, multimodal, embeddings, re-rank, and code.
  • Enhanced GPU orchestration: Efficiently manage, scale, and orchestrate GPU resources with job scheduling, auto-scaling, and traffic control.
  • New Enterprise plans: We’ve added new ‘Scale’ and ‘Enterprise’ plans to enable seamless scaling. Our new Enterprise plan includes unlimited rate limits and dedicated support.
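The continuous-optimisation and bring-your-own-model bullets above lean on the fine-tuning workflow Together AI already sells. The sketch below, based on the company’s Python SDK, shows the general shape of that workflow: upload a training file, then start a fine-tuning job. Treat the parameter names, file path and model identifier as assumptions to verify against the current SDK documentation.

    # Assumed workflow based on Together AI's Python SDK; verify parameter
    # names against the current documentation before relying on this.
    import os

    from together import Together

    client = Together(api_key=os.environ["TOGETHER_API_KEY"])

    # Upload a JSONL training file (path and contents are illustrative).
    training_file = client.files.upload(file="customer_support.jsonl")

    # Start a fine-tuning job against a hosted open-source base model.
    job = client.fine_tuning.create(
        training_file=training_file.id,
        model="meta-llama/Meta-Llama-3.1-8B-Instruct",  # illustrative base model
        n_epochs=3,
        suffix="support-bot",  # tag for the resulting custom model
    )

    print(job.id, job.status)

Once the job completes, the resulting custom model can be called through the same inference interface shown earlier, which is the integration the platform is selling.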

One of the key points of this new platform is support for the GenAI lifecycle. Together AI supports fine-tuning existing models but not the initial creation of GenAI models. It needs to clarify where its support for the GenAI lifecycle starts.

It is similarly unclear how the company deals with data privacy. The documentation lacks instructions on how to identify and mark data as protected, or how to prevent models from inferring sensitive data. At present, the company relies on data staying within customers’ environments to fulfil data privacy and security requirements.

Performance is a major benefit of the Together Enterprise Platform

The company says the “Together Inference Engine is the fastest inference engine that can be deployed in any environment.” It goes on to say that it is “consistently 2-3x faster than hyperscalers on the Llama 3.1 model family (8B, 70B, 405B).”

Support for that claim comes from independent testing carried out by Artificial Analysis on the Together AI platform as recently as September 17, 2024. The tests used the Llama 3.1 405B benchmark and show a vast difference between Together AI’s Turbo mode and Databricks, Azure and AWS.

Enterprise Times: What does this mean?

As organisations continue to investigate the use of GenAI, they realise it comes with costs and demands that require planning. They want GenAI at the lowest cost and risk, with data security, compliance, the highest performance and maximum flexibility.

Together AI offers all of that to customers. Its performance advantage over competing platforms contributes to a lower-cost model. Its support for open-source AI models also contributes to lower cost and flexibility. Customers’ ability to run on their own on-premises infrastructure and virtual private clouds addresses security, risk and compliance.

It will be interesting to see when it starts engaging with customers at the start of their AI journey. That includes helping them choose the right model for the type of AI they want and ingesting large data sets. The current fine-tuning capabilities are adequate, but they are of little help with the initial data import.

The company also focuses on large language models. However, there is a shift towards connected small language models (SLMs) that work well for internal GenAI solutions. Will it add functionality to help customers build, deploy and run connected SLMs?
