IBM, Dell pros consider on-prem options for generative AI

If your enterprise is trying out a lot of AI, be ready to rethink IT, Dell’s CTO said.

April 12, 2024

In the AI age, IT pros have a familiar question to consider: On-prem or not on-prem?

According to a recent panel discussion at Bloomberg Headquarters, IT practitioners must take inventory of a company’s AI efforts and weigh the benefits of hosting the services on-premises versus in the cloud.

“The thing you should be asking yourself on a technical level is at what point is the majority of the IT capacity of your enterprise in service of AI-driven technology versus traditional workloads? And if the answer is greater than 50%, you need to rethink your entire IT architecture and start building to optimize for AI first,” John Roese, global chief technology officer, Dell Technologies, told a crowd during Bloomberg Intelligence’s April 4 event, Generative AI: Next steps and evolution.

The evolution: Generative AI is poised to produce $1.3 trillion in revenue, according to Bloomberg Intelligence, which projects that the technology will “expand its impact from less than 1% of total IT hardware, software services, ad spending, and gaming market spending to 10% by 2032.”

Early, controversial uses of the tech already include generating summaries, code, personalized recommendations, art, and language translation.

IT leaders have a few options for powering their gen-AI efforts:

All cloud. Some companies will turn to the public cloud to deploy generative AI. That is great news for hyperscalers like Google, Microsoft, and Amazon, which provide the heavy compute required for resource-intensive processes like the training of large language models (LLMs). “If you wanted to fine-tune a model and do training of your own, honestly, I would recommend you do that probably in a hyperscaler or somewhere that gives you access to lots of flexible infrastructure, because that’s a one-off as opposed to a steady state,” Roese said during the panel.
All on the ground. An on-premises generative AI approach deploys the applications within an org’s own data centers and infrastructure, allowing control over sensitive data. “It turns out that if you’re running something full speed forever, it is very efficient to do it on owned infrastructure in your own environment under your own control,” Roese told attendees.
A little bit of both. Many enterprises will have a hybrid approach: distributed systems, multiple clouds, multiple LLMs. “What we’re seeing as a trend is almost everyone is picking a major hyperscaler, a backup one, and on-prem,” Dr. Ruchir Puri, chief scientist, IBM Research, told the crowd.

A study from Red Hat (an IBM subsidiary), published in March 2023, polled 300 enterprise and IT pros from the Gartner Peer Community (a professional network) about their plans for AI and machine learning (ML). The study found that 66% of respondents had deployed their organization’s AI/ML projects via a hybrid cloud (4% said they were deploying them on-premises; 30% said they were deploying fully in the cloud).

IT pros, including Roese, recently spoke with IT Brew about the potential cost and accessibility benefits of custom AI services on owned or rented hardware.

“You don’t want to be on the receiving end of having a public cloud environment that charges you per transaction, and then you build a wildly successful GPT to serve your customers,” Roese told IT Brew.

In the industry panel at Bloomberg HQ, Roese noted that orgs are now “doing the math” to optimize their usage of AI, including the cost and security benefits of on-prem management.

“If you can control the most important data in that distributed system, the overall risk from a privacy perspective and a business risk perspective goes down dramatically,” Roese told the audience.
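The cost side of that math can be sketched in a few lines of Python. This is a minimal illustration, not a model any panelist described: it assumes cloud spend scales linearly with request volume, while owned infrastructure carries a fixed upfront cost plus a lower running cost. Every function name and dollar figure below is a hypothetical placeholder.

```python
"""Break-even sketch: per-request cloud pricing vs. owned infrastructure.

All names and figures are hypothetical placeholders, not numbers from the panel.
"""


def cloud_cost(millions_of_requests, cloud_price_per_million):
    """Total cloud spend when billing scales linearly with usage."""
    return cloud_price_per_million * millions_of_requests


def onprem_cost(millions_of_requests, upfront_cost, run_cost_per_million):
    """Total on-prem spend: a fixed purchase plus a running cost per request."""
    return upfront_cost + run_cost_per_million * millions_of_requests


def breakeven_millions(upfront_cost, run_cost_per_million, cloud_price_per_million):
    """Request volume (in millions) at which both options cost the same.

    Returns None when on-prem never catches up, i.e. when its running cost
    per million requests is not lower than the cloud price.
    """
    saving_per_million = cloud_price_per_million - run_cost_per_million
    if saving_per_million <= 0:
        return None
    return upfront_cost / saving_per_million


if __name__ == "__main__":
    # Hypothetical inputs, in dollars.
    UPFRONT = 250_000  # assumed hardware purchase
    RUN = 2_000        # assumed on-prem running cost per million requests
    CLOUD = 10_000     # assumed cloud price per million requests

    point = breakeven_millions(UPFRONT, RUN, CLOUD)
    print(f"Break-even at {point} million requests")
    for volume in (10, 50):
        print(volume, cloud_cost(volume, CLOUD), onprem_cost(volume, UPFRONT, RUN))
```

With these made-up inputs, the two options cost the same at 31.25 million requests. Below that volume the cloud is cheaper, and above it owned hardware wins, which mirrors Roese’s point about steady-state workloads. A real comparison would also need to account for staffing, power, hardware depreciation, and negotiated cloud discounts, none of which this sketch includes.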

Big, not best? In a survey conducted in April 2023 by the software company expert.ai, 37.1% of respondents said they were already considering building enterprise-specific language models, ones smaller than giant public generators trained on the entire internet.

“What we really are seeing is small- to medium-sized models are the ones that are getting deployed the most, because they are cost efficient, efficient to deploy,” Puri said.

Both panelists appreciate the search-engine-like power of the largest language models, but they don’t see that vast repository of answers helping the enterprise.

“These smaller large language models, the domain-specific ones are actually extremely good and they sound like human beings. They are very good at interaction. They’re even multimodal in many cases. What they don’t have inside of them is the entire internet’s worth of information, but I don’t really need them to write poetry in Urdu,” Roese said.
