Low-code AI platform vendor Predibase has released new research looking at Large Language Model (LLM) adoption. The report is titled “Beyond the Buzz: A Look at Large Language Models in Production.” The report offers some sober reading for those embracing or planning to embrace LLMs.
Piero Molino, co-founder and CEO of Predibase, said: “It is now open season for Large Language Models (LLMs). Thanks to the widespread recognition of OpenAI’s ChatGPT, businesses are in an arms race to gain a competitive edge using the latest AI capabilities. Still, they require more customized LLMs to meet domain-specific use cases.
“This report highlights the need for the industry to focus on the real opportunities and challenges as opposed to blindly following the hype.”
What do we learn from the report?
According to Predibase, the key takeaways break into four areas:
- Less than a quarter of enterprises are comfortable using commercial LLMs. Roughly a third (33%) cite concerns about sharing sensitive or proprietary data with commercial LLM vendors, leading to increased interest in privately hosted, open-source alternatives.
- Open-source LLMs are gaining momentum. Roughly 77% of respondents either don’t use or don’t plan to use commercial LLMs in production beyond prototypes, citing concerns about privacy, cost, and lack of customization, leading to an uptick in interest in open-source alternatives. Meta, for example, has shifted its approach: where the original LLaMA was restricted to research use, its successor LLaMA 2 is available as open source and is free for both commercial and research applications.
- While generative AI use cases remain popular, enterprises see the potential of other applications to provide business value. Information Extraction is the second most popular use case (selected by 32.6% of respondents). This involves using LLMs to convert unstructured data, such as PDF documents or customer emails, into structured tables for aggregate analytics (see the sketch after this list). Next was Q&A and Search (15.2% of respondents), the “brains” behind chatbots that provide accurate and relevant responses to user queries in real time.
- Organizations are turning to customized LLMs to achieve more accurate and tailored results. Most teams plan to customize their LLMs through fine-tuning (32.4%) or reinforcement learning from human feedback (27%). The roadblocks teams face with fine-tuning continue to be a lack of data (21%) and the overall complexity of the process, such as managing infrastructure (46%).
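To make the Information Extraction use case more concrete, here is a minimal sketch of the pattern: prompting a model to turn a free-form customer email into a structured record. It assumes the OpenAI Python client (v1+) and an API key in the environment; the model name, prompt, and field names are illustrative and are not taken from the Predibase report.

```python
# Illustrative sketch only: extracting structured fields from an unstructured
# customer email with an LLM. Assumes the OpenAI Python client (v1+) and an
# OPENAI_API_KEY in the environment; prompt and field names are hypothetical.
import json
from openai import OpenAI

client = OpenAI()

EXTRACTION_PROMPT = """Extract the following fields from the customer email and
return them as JSON: customer_name, product, issue_category, requested_action.
Use null for any field that is not present.

Email:
{email}"""


def extract_fields(email_text: str) -> dict:
    """Ask the model to turn free-form text into a structured record."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # any chat-capable model would do here
        messages=[{"role": "user",
                   "content": EXTRACTION_PROMPT.format(email=email_text)}],
        response_format={"type": "json_object"},  # request well-formed JSON
    )
    return json.loads(response.choices[0].message.content)


if __name__ == "__main__":
    email = ("Hi, my name is Jane Smith. The Acme X200 router I bought last "
             "month keeps dropping connections. Please send a replacement.")
    record = extract_fields(email)
    print(record)  # e.g. {"customer_name": "Jane Smith", "product": "Acme X200", ...}
```

Once the output is reliably structured like this, the records can be loaded into a table for the kind of aggregate analytics the report describes.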
LLM adoption is still about experimentation
A closer look at the report throws up some other interesting findings. For example, 44% of those who have adopted LLMs have done so for experimentation only. Only 14% have LLMs in production, with 13% running one or two models and just 1% running more than two.
What is holding companies back? That’s a good question. Companies are investing in staff, but, surprise (not), there is a race to attract people with the right skills, and salaries are starting to spiral. As noted above, concerns over sharing sensitive or proprietary information are also top of mind.
All of that raises some serious questions about how companies will adopt LLMs. Will they use public or private LLMs or take a hybrid approach? The reality is that for larger enterprises, and those who want to protect their data, a hybrid approach is likely to be the solution.
Interestingly, Predibase is not the only vendor talking about the need for a hybrid approach. Appian made this a key theme of their user conference earlier this year. Since then, there has been a rush of open-source LLMs and a focus from analyst firms on the need for private solutions.
Of interest here is that protecting data is finally being seen as part of the initial experimentation phase. Companies want to protect sensitive data but don’t know how. Creating one or more corporate LLMs makes sense, as data can be ring-fenced and only anonymised data sent to public LLMs.
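As a rough illustration of that ring-fencing idea, the sketch below redacts obvious identifiers before a prompt leaves the corporate boundary and routes anything flagged as proprietary to a privately hosted model. The regex patterns and the two send_* functions are hypothetical placeholders, not anything described by Predibase.

```python
# Illustrative sketch of the ring-fencing idea discussed above: redact obvious
# identifiers before anything leaves the corporate boundary, and route prompts
# containing proprietary data to a privately hosted model instead of a public one.
# Patterns and the send_* functions are hypothetical placeholders.
import re

PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "ACCOUNT_NO": re.compile(r"\b\d{8,12}\b"),
}


def anonymise(text: str) -> str:
    """Replace matches with placeholder tokens so no raw identifiers are sent out."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"<{label}>", text)
    return text


def route_prompt(prompt: str, contains_proprietary_data: bool) -> str:
    if contains_proprietary_data:
        return send_to_private_llm(prompt)        # stays inside the ring-fence
    return send_to_public_llm(anonymise(prompt))  # anonymised before it leaves


# Placeholder transports: in practice these would call a self-hosted model
# endpoint and a commercial API respectively.
def send_to_private_llm(prompt: str) -> str:
    return f"[private model] {prompt}"


def send_to_public_llm(prompt: str) -> str:
    return f"[public model] {prompt}"


if __name__ == "__main__":
    print(route_prompt(
        "Summarise the complaint from jane@example.com, account 123456789.",
        contains_proprietary_data=False,
    ))
```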
What is motivating people to adopt LLMs?
The report gives four primary motivations:
- 37% desire to build generative AI capabilities
- 26% want to accelerate the development of AI/ML projects
- 26% want to make it easier to tackle new AI/ML projects that were previously too complex
- 11% are looking to enable a broader set of personas to work with AI/ML
These are very different from the reasons individuals give when asked. There are numerous posts on LinkedIn and other social media sites where people talk about using public LLMs to write reports and produce presentations. Where that involves financial data, there are questions about the legality and compliance of doing so.
For others, it provides a faster and better way of doing things. Recently, in the UK, there was a news report in which the owner of a company explained why he had adopted generative AI. It turns out it writes better sales emails than his staff, freeing them up to follow up on leads.
But it is not all good news. We are seeing increased use of LLMs by cybercriminals because they produce better phishing emails than the criminals can write themselves. In addition, once automated, the emails can be customised for each potential victim.
Five use cases and challenges of LLMs
The report lists five top use cases for LLMs.
- Generative AI
- Information Extraction
- Text Classification
- Q&A and Search
- Personalisation/Recommender Systems
Each is explained in full in the report, and it is easy to see how businesses would adopt them. Interestingly, look closely and you can see why there is a need for both public and private LLMs. You can also see how data security can be compromised.
Ironically, some of the use cases are just extensions of what we’ve been trying to do for years, such as the Q&A and Search, as well as the personalisation/recommender examples.
Challenges abound with any technology adoption, and LLMs are no exception. Once again, the issue of proprietary data (33%) tops the list. Of the others, the one that stands out is the 17% concerned that LLMs are too costly to train. At present, there are few published figures on cost, and with access to cloud computing, those costs can be controlled. The better question concerns not the initial training but the ongoing maintenance costs, especially for private LLMs.
Enterprise Times: What does this mean?
At least once a week, a new research report arrives talking about LLMs. Common themes such as data security and cost appear in most of them. What is also emerging is a growing expectation that public LLMs are not the only solution. For companies to get the best out of their data while protecting it, hybrid LLMs are going to be the way forward.
It will be interesting to see how the adoption of low-code solutions for LLMs, from vendors such as Predibase, takes off. Companies have seen significant benefits from low-code. Can those same benefits extend to using it to build LLMs?