AI Reasoning Is Cool, But First How Can We Tackle Organisational Debt?

The excitement around what AI will mean for organisations and the new world of work continues its exponential growth, but it seems AI readiness and adaptation within organisations is still on a shallower linear growth path.

I hesitate to widen this gap by focusing too much on the shiny imagined futures that may or may not come to pass (at the expense of the hard, thankless work of preparing our organisations to make the most of AI), but it is interesting to see where leading investors think this is all headed.

Vinod Khosla is more than bullish, and sees a straight line from ‘is theoretically possible’ to ‘will happen’ in terms of how AI will impact work.

I estimate that over the next 25 years, AI can perform 80% of the work in 80% of all jobs—whether doctors, salespeople, engineers, or farm workers. Mostly, AI will do the job better and more consistently. Anywhere that expertise is tied to human outcomes, AI can and will outperform humans, and at near-free prices.

I admire his optimism, and I hope he is right. But this part of his thesis hinges on what we regard as ‘work’. A huge amount of the dumb process work people do today could be automated, for sure, but then the definition of work will shift to higher value activities. And let’s not forget that so many automatable processes in agriculture, manufacturing and other sectors are still performed manually today because in many areas of the world people are seen as cheaper than technology.

Can Enterprise AI do strategic thinking?

Looking beyond the automation of repeatable process work, algorithmic calculations and the things computers are already good at, will we really see AI encroach on higher order tasks such as strategy formation, knowledge work, thinking and reasoning?

Another investor, Sequoia, certainly believes so and suggests we are seeing the emergence of a new frontier for AI reasoning:

This leap from pre-trained instinctual responses (“System 1”) to deeper, deliberate reasoning (“System 2”) is the next frontier for AI. It’s not enough for models to simply know things—they need to pause, evaluate and reason through decisions in real time. Think of pre-training as the System 1 layer. Whether a model is pre-trained on millions of moves in Go (AlphaGo) or petabytes of internet-scale text (LLMs), its job is to mimic patterns—whether that’s human gameplay or language. But mimicry, as powerful as it is, isn’t true reasoning. It can’t properly think its way through complex novel situations, especially those out of sample. This is where System 2 thinking comes in, and it’s the focus of the latest wave of AI research. When a model “stops to think,” it isn’t just generating learned patterns or spitting out predictions based on past data. It’s generating a range of possibilities, considering potential outcomes and making a decision based on reasoning.

There has also been a flurry of recent attempts to test the ability of LLMs to reason or to think strategically, which is what we imagine CEOs spend their time doing, to see whether these models could play a role higher up the value chain.

In an experiment to see if AI can outperform CEOs, Strategize worked with academics from the University of Cambridge to simulate strategy formation as a kind of turn-based process, and were impressed by the results:

GPT-4o’s performance as a CEO was remarkable. The LLM consistently outperformed top human participants on nearly every metric. It designed products with surgical precision, maximizing appeal while maintaining tight cost controls. It responded well to market signals, keeping its non-generative AI competitors on edge, and built momentum so strong that it surpassed the best-performing student’s market share and profitability three rounds ahead.

However, there was a critical flaw: GPT-4o was fired faster by the virtual board than the students who played the game.

Why? The AI struggled with black swan events — such as market collapses during the Covid-19 pandemic.

Other research papers have recently found similar potential for LLMs to perform well in strategic decision making, such as this one from the National Bureau of Economic Research:

We propose that LLMs can be used to create ecologies of AI agents that interact in environments mimicking the strategic interdependencies and complexities of real-world settings. Without the need to mechanistically specify their behavior, researchers can endow these autonomous agents with objectives, preferences, capabilities, and personalities of their choosing. The end result is a fast, robust, and flexible method to generate theoretical predictions under different assumptions. We refer to this framework as Generative AI-Based Experimentation (GABE). When compared with other types of simulations, we argue that GABE has one crucial difference: it relies on AI agents whose behavior is not deterministically pre-specified by the researcher, yet whose reasoning can still be elicited via direct prompting. As we aim to demonstrate, learning the mechanisms behind their actions in this way and tweaking the characteristics of AI agents is a powerful way to build, validate, and extend management theory.

Whether this new affordance is used as a form of decision support assistant or as a replacement for management decision-making remains to be seen, but I suspect it will be the former, so as to keep a human in the loop on the most important issues.

Overcoming org debt and artificial stupidity

Building on our case study last week on digital twins, these findings suggest that large organisations in complex markets might benefit from using a ‘strategy twin’ to uncover and evaluate strategic options.

Of course, all of this rests on an organisation’s ability to deploy agentic AI, which we have written about quite a bit in recent months, and also to harness its knowledge graph. These are not insurmountable challenges by any means, and we are seeing signs of progress in both areas – e.g. Atlassian just announced that its Rovo AI search tool, first unveiled 6 months ago, is now generally available for customers who have so much of their corporate knowledge stored on Confluence wikis.

But there are other problems, such as organisational debt and the proliferation of dumb processes, that will also hinder progress towards creating the intelligent enterprise unless we start to address them.

Writing in Time magazine earlier this year, Rodney Evans discussed this issue and how it affects the ability of our teams to grow and become more productive:

Processes like this are rife with organizational debt—the sand in the gears that employees and customers have to contend with when a company’s ways of working become outdated and calcified. A simpler way to describe it is, simply, waste. Some call it sludge or bureaucracy mass; others experience it as death by a thousand paper cuts. By one estimate, it costs OECD economies $9 trillion annually in lost economic output.

Org debt exists whenever you find yourself spending more time navigating a process or set of rules than engaged with the issue itself.

The buzzword of the moment is “efficiency.” Companies are hiring leaders to trim fat; kicking off programs with high-cost management consultants under banners of “Strategic Cost Transformation,”; and restructuring, laying off, and increasing prices in hopes of recapturing the growth experienced during a once-in-a-lifetime crisis of the Covid pandemic.

There’s just one problem: You don’t get long-term efficiency by slashing dollars and headcount. Those are crash diets—the most obvious, least-nuanced plays. Sustainable efficiency doesn’t come from a juice cleanse, but from changing the habits of how an organization moves.

Before we can exploit artificial intelligence in the enterprise, a good starting point would be to free people from this sludge and ask why our organisations tend to make people collectively incompetent in the workplace, as Bertrand Duperrin discussed this week as part of a conference debate titled ‘Do companies make people incompetent?’

Addressing this question will also require a shift in the focus of HR as a function – from policing and process compliance to supporting and advocating for a better employee experience, as Ashley Goodall wrote about last week in MIT Sloan. We probably also need to separate management roles that are really about situational leadership, coaching and making things happen from those that are really just about water carrying, process and performance management and politics, as this latter category is an unnecessary cost rather than a force multiplier.

Start at the team level

In the organisational transformation projects we have been part of over the years, we have always said the goal is not to ‘change’ anybody, least of all to change everybody – better to over-invest in the willing ones to create examples of a better way of doing things that might spread.

Rather than work to dismantle the kind of bureaucratic scaffolding that Rodney Evans refers to above, it is better to find simpler solutions for the coordination of work that don’t rely on that structure, in the hope that it becomes less relevant and falls away of its own accord.

There is simply too much ‘cruft’ and too many process weeds that have grown around the branches of the formal org chart over time to be able to cut through even with an AI-powered tool. Better to just do things differently and hope it withers on the vine.

One of the things I like most about working in wikis is that you start with no structure, just a blank page, and you gradually create the minimum structure you need to connect the content as it grows, compared to the old model of file folders that you create ahead of time and then fill with content. It would be great to see the same philosophy applied to coordinating the work of teams within an organisation – what is the minimum viable bureaucracy needed to connect and combine their work?

We have written previously about the importance of wrapping lightweight AI agents around teams that are encouraged to operate in a self-managed and largely autonomous, service-oriented configuration, and this is still our best advice on where to begin.

This week, I also shared a video talk on how to get started with Centaur teams as part of the Digital Leaders Week event, which contains 28 minutes of reflections and advice on supercharging service teams with supportive AI, and what this might lead to in the future.

If you would like to learn more about our work supporting teams in adopting agile, autonomous, AI-enhanced ways of working, please get in touch for more information and up-to-date project stories.