
When world building and word predicting collide.


Where AI-enhanced accreditation (currently) sits.

A few weeks ago I was presenting ten days of work to a university client – a detailed accreditation methodology built from the ground up. The report was substantial. The analysis was deep. And on the front page, as we always do, there was a disclosure that the deliverable had been produced in part using AI.


The client paused, looked up, and asked: "So why should I pay you for something I could produce myself using Copilot?"


I paused too.


Not because I didn't have an answer. But because in that moment I was running through what went into that report. Three years of work by my colleagues redefining competencies. Reviews of academic frameworks from across the world. A deliberate decision to reject every existing taxonomy we examined before extending the one that came closest into something that was purpose-built for professional competency assessment. Ten days of rigorous analytical work applying that combination of methodologies to this institution's specific programmes.


None of that was visible on the page. What was visible was a polished, coherent, well-structured document. And to someone who hadn't seen what went into it, that looked like something a prompt could produce.


So I said: "That's an excellent idea. Let's meet when you have your report ready and compare it with what we've produced."


Three days later he came back. "This is excellent work. Far more detail than I had realised."


His epiphany – and I say this without any sense of triumph, because I've had versions of it myself – was that AI makes quality invisible. A well-structured AI-enhanced deliverable and a genuinely rigorous one can look identical on the surface. Fluency is not accuracy. Presentation is not methodology. And the believability of an AI-assisted output is not the same thing as its quality.


Why this matters beyond one meeting

Dr Sam Illingworth, who writes the Slow AI newsletter, has been making a version of this argument for some time. His recent piece, ‘World Models vs Word Models’ (Slow AI, March 2026, https://www.linkedin.com/feed/update/urn:li:activity:7439960781405745152/), makes a point that resonated with me: we have built powerful tools that sound equally confident whether they are right or wrong. And most users haven't yet developed the habit of asking what the tool would need to ‘actually understand’ to get something right – and whether the way it works gives it access to that understanding.

That's a discernment question. And it's one most AI users – including, in that moment, a thoughtful and experienced academic – aren't yet routinely asking.


This isn't an argument against AI. It's an argument for knowing what you're looking at.



The debate in the background

There's a lively argument in AI research right now about whether large language models have a fundamental ceiling. Yann LeCun, Turing Award winner, recently raised $1.03 billion for AMI Labs to build what he calls world models, a fundamentally different architecture to the LLMs powering most tools in professional use today (TechCrunch, March 2026). His argument: LLMs predict words rather than model reality, and that's a structural limitation no amount of compute power can fix. Fei-Fei Li's World Labs has made a similar bet, raising approximately $1 billion for world model research. Two of the most respected researchers in AI are now publicly backing a different approach to the one underpinning every major product currently on the market.


They may be right. But that debate shouldn't paralyse the professionals who are using existing tools well today to solve real problems.

My position is a simple one: let's acknowledge that every technology improves, and what comes next may surpass what's currently available. But let's not forsake ‘good’ – when it's genuinely fit for purpose – in search of perfect, which may not be commercially available for a decade or more.

The question the client was really asking in that meeting wasn't whether AI was involved. It was whether the work was worth paying for. And the answer to that question has nothing to do with the tool. It has everything to do with the methodologies, the expertise, and the judgment that shaped how the tool was used.



What discernment actually looks like

In accreditation work, the question we're always trying to answer isn't "does this curriculum mention this competency?" It's whether the curriculum ‘develops’ it – to the right cognitive level, in a professional context, with evidence of assessment. That's a causal and structural question. It requires expertise to ask and expertise to interpret.

LLMs don't answer that question on their own. But deployed within a methodological approach designed by people who understand what the question requires, they can do significant analytical work in service of it. The tool accelerates the process. Human judgment, expertise, governance and ethical oversight interpret the output and determine the correct conclusion.

That combination is where the value lives. Not in the tool alone. Not in waiting for a tool that can do it without human input. In the deliberate, considered deployment of good technology by people who know its limits.


If you're a professional body leader or an institutional decision-maker thinking about where AI fits in your work, I'd offer one practical test: before you trust an AI-assisted output, ask what genuine understanding of the task actually requires. Then test whether the process that produced the output had access to that understanding, or whether it produced something that only looks right on the surface.


That's the discernment Sam Illingworth is calling for. It's also, as my client discovered, the difference between a tool and what good designers build when the off-the-shelf tools don't fit the problem: a methodology.


