‘There is no law of computer science that says that AI must remain expensive and must remain large’: IBM CEO Arvind Krishna bangs the drum for smaller AI models
Lightweight, domain-specific models will be the go-to option for enterprises moving forward


IBM CEO Arvind Krishna believes smaller, domain-specific AI models represent a prime opportunity for enterprises to truly capitalize on the technology.
Speaking at IBM’s 2025 Think conference in Boston, Krishna reflected on the course of the generative AI boom so far, noting that the industry has matured significantly compared to last year - as have enterprises and their expectations of the technology.
“As I think about last year to this year, there’s one really big difference that I’m feeling, which is that AI has moved from experimentation to a focus on unlocking business value,” he told attendees.
“People are worrying about what is the use-case? How do I get my business to scale leveraging AI? And I think that's a real big difference, because that means, if I sort of think about hype cycles, that means the hype cycle’s kind of fading, and we are now thinking about adoption, we're thinking about ROI, we're thinking about business value.”
The initial focus on large, cumbersome AI models has waned, Krishna claimed, and is instead shifting towards more finely curated options aimed at tackling specific business challenges.
A key factor here is that businesses adopting AI solutions are dead set on supercharging productivity. While larger models have helped with this, they often amount to a one-size-fits-all approach.
In contrast, smaller models are now proving vital by enabling enterprises to harness internal data in a more efficient and strategic manner, Krishna noted.
“I think AI is the source of productivity for this era,” he said. “But not all AI is built the same and not all AI is built for the enterprise.
“Why do I make that claim? 99% of all enterprise data has been untouched by AI. So if you need to unlock the value from that 99%, you need to take an approach to AI that is tailored for the enterprise.
“If you think about the massive general purpose models, those are very useful, but they are not going to help you unlock the value from all of the data inside the enterprise.”
Smaller models have a key advantage in that they inherently allow enterprises to derive greater value from enterprise data on a case-by-case basis. Ultimately, this not only helps businesses drive productivity, but also allows them to balance costs and target specific areas where the technology has maximum impact.
“To win, you are going to need to build special purpose models, much smaller, that are tailored for a particular use-case that can ingest the enterprise data and then work,” he explained.
“So you'll think about it, and then you’ll say, ‘well what about accuracy?’ - go look at leaderboards. Smaller models are now more accurate than larger models. What about intellectual property? That is where the open nature of some of the smaller models comes in.”
When assessing specific domains in which to deploy AI solutions, Krishna added that it’s “a lot easier to build a smaller model that can go after that”.
“So as we begin to go forward in that, you can think about these models that are 3, 8, 13, 20 billion parameters, as opposed to 300 or 500 billion, and then you begin to think about, well what advantage do I get? They're incredibly accurate. They are much, much faster,” he said.
“They're much more cost effective to run, and you can choose to run them where you want.”
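A rough back-of-the-envelope calculation illustrates the cost point Krishna is making: the GPU memory needed just to hold a model's weights scales linearly with parameter count. This sketch is an illustration only, assuming fp16/bf16 weights (2 bytes per parameter) and ignoring activations and the KV cache, which add further overhead in practice.

```python
def weight_memory_gb(params_billion, bytes_per_param=2):
    """Approximate memory needed to store the weights alone.

    Assumes fp16/bf16 storage (2 bytes per parameter); activations
    and KV cache are ignored, so real serving needs somewhat more.
    """
    return params_billion * 1e9 * bytes_per_param / 1e9  # result in GB

for size in (3, 8, 13, 20, 300, 500):
    print(f"{size:>3}B params -> ~{weight_memory_gb(size):,.0f} GB of weights")
```

On these assumptions, an 8B-parameter model needs around 16 GB for its weights and fits comfortably on a single 80 GB GPU, while a 500B-parameter model needs roughly 1 TB spread across many accelerators, which is the gap behind "you can choose to run them where you want".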
IBM can find its niche with small AI models
It’s in this approach that Krishna believes IBM has a major advantage. The company’s watsonx platform, for example, allows enterprises to build applications designed to tackle specific challenges and areas of the business.
Similarly, Krishna pointed to IBM’s Granite model range, which he said is “a lot cheaper than some of the other alternatives”. These products have become core focus areas for IBM, and it intends to continue investing heavily in both moving forward.
As the industry matures and progresses, Krishna believes this focus on smaller model ranges will have a profound impact, opening up opportunities for enterprises in a trickle-down-style effect.
“As technology comes down the cost curve, it opens up many, many more opportunities,” he told attendees.
“It opens up a huge amount of aperture in terms of the problems you can afford to solve. That’s kind of been the curve it is on. We sometimes capture it in the phrase of, well, that's how we democratize technology, or rather, make it accessible to all, because the cost has come down.
“That is what we are very, very focused on. There is no law of computer science that says that AI must remain expensive and must remain large.”
Small language models are all the rage

Krishna isn’t the first to suggest that small language models will make all the difference for enterprises in the coming year. As organizations weigh the largest models against the lowest-cost models that meet their needs, looking to speed up returns on their AI investments, developers are pivoting to lighter models that perform well at the edge.
The likes of Google’s Gemma 3 and Meta’s Llama 4 Scout can be run on a single Nvidia H100 GPU, while in the public cloud, OpenAI’s GPT-4.1 nano is its most cost-competitive model for text-only tasks.
By leaning into this burgeoning field with its lightweight Granite model family, IBM could finally carve out a meaningful place for itself in the AI market.
Ensuring that each of its SLMs has a clear, competitive, domain-specific use case will be the key here. While there’s little chance of taking on the AI leaders at their own game – namely general purpose models – there’s clear demand for models that simply do one kind of task very well.
Mixture of experts (MoE) models, an architecture in which an AI model is made up of several smaller sub-models which are ‘experts’ at specific tasks and are only activated when needed, complicates the picture when it comes to how ‘small’ SLMs actually are.
The MoE approach allows models which, on paper, have many billions more parameters than the smallest models available on the market to run at a similar latency and cost per token. Take Llama 4 Scout, for example. Meta’s smallest frontier model has a total of 109 billion parameters, but only 17 billion active parameters at any given time, giving it a more lightweight computational footprint.
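The routing idea behind MoE can be sketched in a few lines: a gating function scores every expert for an incoming token, but only the top-k experts actually run, so compute cost tracks the active parameters rather than the total. This is a minimal toy sketch, with each "expert" reduced to a small linear map for illustration; real MoE layers use learned gates and feed-forward expert networks.

```python
import numpy as np

rng = np.random.default_rng(0)

def moe_layer(x, experts, gate_w, top_k=2):
    """Route one token vector through a toy mixture-of-experts layer.

    Only the top_k highest-scoring experts run; the rest stay idle,
    so compute scales with active parameters, not total parameters.
    """
    scores = x @ gate_w                    # one gating score per expert
    top = np.argsort(scores)[-top_k:]      # indices of the top_k experts
    weights = np.exp(scores[top])
    weights /= weights.sum()               # softmax over the chosen experts
    # Weighted sum of the selected experts' outputs only.
    return sum(w * experts[i](x) for w, i in zip(weights, top))

dim, n_experts = 8, 16
# Each "expert" here is just a random linear map, for illustration.
expert_mats = [rng.standard_normal((dim, dim)) for _ in range(n_experts)]
experts = [lambda x, M=M: x @ M for M in expert_mats]
gate_w = rng.standard_normal((dim, n_experts))

y = moe_layer(rng.standard_normal(dim), experts, gate_w, top_k=2)
print(y.shape)  # (8,) -- produced using only 2 of the 16 experts
```

In the same spirit, Llama 4 Scout's 17 billion active parameters out of 109 billion total mean each token touches only a fraction of the network, which is why its latency and cost per token can resemble a much smaller dense model.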
As MoE models grow in popularity, SLM developers will have to demonstrate concrete benefits to sticking with their approach, or adapt to offer a hybrid lineup. The jury’s still out on which architecture delivers the best output at the lowest latency, so there’s all to play for here.
For example, OpenAI boasts that GPT-4.1 nano takes just five seconds to generate its first output tokens when given an input of up to 128,000 tokens. If IBM could demonstrate a comparable or faster response time under its open source licensing, enterprises would be able to rely more heavily on AI at the edge.
The more domain-specific IBM gets, the more potential for wins in very niche markets.

Ross Kelly is ITPro's News & Analysis Editor, responsible for leading the brand's news output and in-depth reporting on the latest stories from across the business technology landscape. Ross was previously a Staff Writer, during which time he developed a keen interest in cyber security, business leadership, and emerging technologies.
He graduated from Edinburgh Napier University in 2016 with a BA (Hons) in Journalism, and joined ITPro in 2022 after four years working in technology conference research.
For news pitches, you can contact Ross at ross.kelly@futurenet.com, or on Twitter and LinkedIn.