Rajesh Jain

Thinks 1802

Forbes: “Foody, Hiremath and Midha founded Mercor in 2023, originally with the mission of matching engineers in India with U.S. companies in need of freelance coders. They built a recruiting platform that allowed applicants to interview with AI avatars and matched them to companies in need of talent. In the process, they stumbled into the in-demand world of data labeling, pairing expert-level contractors, like Ph.Ds and lawyers, with frontier labs like OpenAI. All three appeared on Forbes 2025 Under 30 list. In September, shortly after Mercor debuted on the Forbes Cloud 100 list of top private cloud computing companies, Foody announced the company had hit $500 million in annualized revenue run rate, up from $100 million in March.” WSJ: “Co-founded in 2023 by three college dropouts, Mercor manages 30,000 contractors around the world who label images, write sentences, and provide expert feedback to help AI chatbots learn how to think and speak like humans.”

SaaStr: “The 3x-5x Revenue Multiple is the New Normal If You Aren’t A Break-Out Leader. Forget about 10x-15x revenue multiples unless you’re growing 40%+ and profitable. The median public SaaS company valuation multiple as of early 2025 stands at 7.0x current run-rate revenue. For private B2B SaaS companies, bootstrapped companies yield a predicted valuation multiple of 4.8x while equity-backed companies yield 5.3x.”

WSJ: “Facing so much uncertainty, Apple tumbled to a market capitalization of $2.6 trillion in April and lost its title as the world’s most valuable company. Six months later, Cook pushed Apple’s market value above $4 trillion for the first time. That’s more than 10 times what the company was worth when Cook took over as chief executive from Steve Jobs 14 years ago. During Cook’s years at the helm, Apple hasn’t unveiled a revolutionary technology or introduced a new product that will reshape people’s lives the way the iPhone did. Instead, Cook, [who turned 65 recently], has won over shareholders by doing just enough to protect and grow the business, a conservative strategy that has been on display this year with clever political and legal maneuvering and enticing new iPhones.”

FT: “Shopping was once a bricks-and-mortar experience. Then came the web, smartphones and social media. Now it is coming to a chatbot near you. For retailers, that’s both good and bad. Agentic commerce, or artificial intelligence-powered shopping, is the hot new trend…Agentic AI also has downsides for retailers. For one thing, shops risk missing out on spur-of-the-moment, add-on purchases, when shoppers who visit a website for one thing end up buying other items as well. Most seriously, retailers are giving away valuable data by allowing transactions to take place in AI apps instead of their own websites. That’s a problem, given they have been using sought-after insights on consumers’ shopping habits to build up their own digital advertising businesses.”

World Models: An Overview and Marketing’s Future (Part 6)

Applications to Marketing

This essay builds on my past writings and inputs from AIs (Claude and ChatGPT).

World models are rapidly emerging as the next frontier of artificial intelligence. Unlike large language models, which predict the next word in a sequence, world models predict what will happen next in the world—they build internal simulations of cause and effect. They imagine, forecast, and plan. If LLMs are powerful librarians of human knowledge, world models are flight simulators for decision-making. And when viewed through the lens of marketing, their impact could be nothing short of transformative.

For decades, marketing has been built on pattern recognition. We segment customers into groups, examine historical patterns, and optimise campaigns based on what has worked before. At its core, marketing has been fundamentally backward-looking: the past determines the next best action. World models flip this on its head. They move marketing from remembering to imagining—from projecting history to simulating futures.

From Customer Segmentation to Individual Simulation

Today, personalisation is rooted in similarity: “customers who bought X also buy Y.” With world models, similarity becomes irrelevant. Instead of predicting behaviour because other customers behaved that way, the AI models the internal dynamics of each specific customer and simulates how that individual will behave in different conditions. This is methodological individualism applied to marketing—understanding each customer as a unique decision-maker, not as a member of a segment. It is no longer “people like her”—it is her.

From A/B Testing to Counterfactual Experimentation

Instead of running experiments on customers, brands can run experiments inside the twin. A marketing team can test hundreds of interventions—timing, cadence, offer structure, creative style, channel mix—safely in simulation. The winning strategy reaches the customer only after the twin has proven it. This dramatically reduces AdWaste and profoundly raises confidence in execution. You’re not spending to learn; you’re learning to spend.

From Pattern Matching to Causal Persuasion

Current marketing predicts that a runner will click on trainers because other runners did so before. World models understand why she runs—whether for stress relief, social belonging, or competition—and generate interventions that align with her motivation. Timing becomes as important as messaging; triggers become as critical as offers; subtle nudges outperform overt pitches. The power lies not in showing the right product but in activating the underlying decision driver.

From Owned Channels to Adaptive Environments

Emails, apps, and websites become dynamic worlds that adjust themselves based on moment-to-moment interaction. A NeoMail is no longer a static message—it becomes a live space that responds to each user’s micro-behaviour, shaping content, rewards, and recommendations automatically. This shift turns the revenue tax paid to ad platforms into investment in owned media that learns and improves. A channel becomes a continuously adjusting environment rather than a delivery mechanism.

From Designed Journeys to AI-Orchestrated Futures

Today’s customer journeys are flowcharts—courses built by humans and refined through hindsight. Tomorrow’s journeys will be discovered, shaped, and adjusted by the model itself in real-time. Every customer receives the journey most likely to maximise lifetime value—not because it was built manually, but because the model has simulated thousands of futures and chosen the best one. Marketing becomes proactive navigation, not reactive optimisation.

From Cost Centre to Profit Engine Through Predictive ROI

World models change the role of marketing measurement. Dashboards today describe the past. World models allow marketers to preview the future—predicting churn before it happens, forecasting which Rest customers can be recovered economically, and projecting the impact of expenditure on attention and retention. This makes outcome-based pricing models like Progency feasible: you can guarantee results because you’ve already seen them work in simulation. Marketing becomes a profit engine with measurable, predictable returns, not a reporting function justifying past spend.

The Core Transformation

In short: LLMs answer questions; world models answer consequences. And that single shift moves marketing from reactive communication to proactive orchestration of customer behaviour. When combined with innovations like AI Twins, NeoMails, Co-Marketer agents, and the war on AdWaste, world models may herald the most consequential marketing transformation in decades—the shift from marketing to memories to marketing to imagined futures.

The question is no longer “what worked before?” but “what will work next?” That’s the difference between looking in the rear-view mirror and having a windscreen that shows you the road ahead before you drive it.

I asked the AIs to bring this future to life via a story about Ria (customer) and Maya (marketer).

Thinks 1801

Bloomberg: “The most important question that companies face in deploying Artificial Intelligence (AI) is not technological but organizational: Should they use AI to increase the power of high-up managers or liberate front-line workers? I suspect that the bulk of them will give the wrong answer to the question — and that we will be dealing with the consequences of their mistakes for decades to come, not just economically, as companies lose their creative flair, but also politically, as professional elites join the ranks of the angry and alienated. Companies will evolve in radically different directions according to the answer that they give to this question. Choose the first answer and they will evolve into “panopticons.””

WSJ: “Organizations perform best when their strategy, or what they aim to achieve, aligns with execution, which is the process of putting that plan into action. It’s called “coherence.”…I believe, though, that a solution to this coherence gap is in sight—thanks to AI. Call it a “coherence prompt.” Trained on an organization’s purpose and strategy, this AI tool wouldn’t make decisions for leaders, but it would act as a kind of strategic conscience, flagging drift, pointing out disconnects and helping people think about their organization’s strategy at every level. It would ask and help assess every project with four simple questions: Does this project or investment fit the strategy we’ve committed to—and the value we’ve promised to deliver? Does it build on the capabilities we’ve said we want to lead with? Are we focusing our time and investment where it matters most? Does this reflect our culture—and what we’ve learned from experience?”

Mint: “Rapid urban growth is exposing cracks in India’s city governance. Without stronger local governments, institutions suited to city scale and better planning, millions risk living in poorly managed urban spaces.”

WSJ: “Forget teaching robots to think like humans. A field called swarm robotics is taking inspiration from ants, bees and even slime molds—simple creatures that achieve remarkable feats through collective intelligence. Unlike traditional robots that take orders from a central computer, swarm robots work like ant colonies. No single robot is in charge, but the swarm accomplishes complex tasks through simple interactions between neighbors. Each robot interacts only with those nearby, sometimes communicating with sounds or chemical signals in particles they release. Researchers say this approach could excel where traditional robots fail, like situations where central control is impractical or impossible due to distance, scale or communication barriers.”

World Models: An Overview and Marketing’s Future (Part 5)

Digital Twins

I next asked Claude to discuss Digital Twins in the context of World Models.

Digital twins have been around since NASA used simulators to troubleshoot Apollo 13 in 1970, though the term wasn’t coined until 1997. By the 2000s, manufacturers were creating virtual replicas of jet engines and factory floors to monitor performance and predict maintenance needs. Today, roughly 75% of businesses employ digital twins in some capacity, from monitoring turbines to simulating entire cities.

But here’s what most digital twins are today: sophisticated data visualization systems. They mirror current states, track historical performance, and run rule-based simulations. A digital twin of a wind turbine displays real-time performance data and can simulate the impact of changing wind speeds based on pre-programmed physics equations. Useful, certainly. Transformative? Not quite.

The Static Twin Problem

Current digital twins have a fundamental limitation—they’re built on static models. Whether it’s a Digital Twin of an Organization mapping business processes or a Digital Twin of a Customer representing buyer behaviors, these systems operate on fixed rules and historical patterns. They can tell you what happened and predict outcomes based on what has happened before, but they struggle with genuine novelty.

A customer digital twin today might predict that Sarah will buy running shoes because she bought running shoes before. It’s pattern matching wrapped in fancy visualization. The twin doesn’t understand why Sarah runs, what motivates her purchases, or how her behavior might evolve as her life circumstances change. It’s a database with a better interface.

Enter World Models

World models change the equation fundamentally. Instead of rule-based simulations, they learn the underlying dynamics of environments. They don’t just store facts about how customers behave—they develop an internal understanding of what drives that behavior, similar to how humans build mental models of cause and effect.

Imagine a customer digital twin powered by world models. Rather than simply mirroring past purchases and predicting repeat behavior, it would understand the customer’s decision-making process. It could simulate how that customer would respond to different marketing interventions, not because similar customers responded that way in the past, but because it has learned how this customer actually thinks and decides.

The twin could run counterfactual scenarios: “If we change the email cadence, how does Sarah’s engagement evolve over three months?” “If we introduce this new product category, which existing customers will find it relevant based on their underlying preferences, not just their purchase history?”
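
To make the email-cadence question concrete, here is a deliberately tiny sketch of running a counterfactual inside a twin. Everything in it is an assumption for illustration: the function names, the twelve-week horizon standing in for "three months", and the `lift` and `fatigue` numbers are invented, and a real world-model twin would learn its dynamics from data rather than use a hand-written formula.

```python
def simulate_engagement(emails_per_week, weeks=12, engagement=0.5,
                        lift=0.04, fatigue=0.015):
    """Toy twin (hypothetical dynamics): each email adds a small lift,
    while fatigue grows with the square of the weekly cadence."""
    history = []
    for _ in range(weeks):
        change = lift * emails_per_week - fatigue * emails_per_week ** 2
        # Keep engagement within the range 0.0 to 1.0.
        engagement = min(1.0, max(0.0, engagement + change))
        history.append(engagement)
    return history


def best_cadence(candidates, **twin_params):
    """Run every candidate cadence inside the twin and return the one
    with the highest engagement at the end of the simulated period."""
    outcomes = {c: simulate_engagement(c, **twin_params)[-1] for c in candidates}
    return max(outcomes, key=outcomes.get), outcomes


# Compare four cadences in simulation before sending a single real email.
winner, outcomes = best_cadence([1, 2, 3, 5])
print(winner, outcomes)
```

With these made-up parameters the twin favours one email a week and shows engagement collapsing at five a week. The numbers are not the point; the experiment runs on the simulated Sarah, not the real one.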

The Convergence

The convergence of world models and digital twins represents a shift from backward-looking analytics to forward-looking simulation. Digital twins have always promised the ability to “test before you deploy”—but with static models, you’re testing against the past, not the future.

World models bring three critical capabilities to digital twins:

Dynamic understanding rather than static rules. The twin doesn’t just know that customers like personalized emails—it understands how personalization affects decision-making and can predict when it becomes intrusive.
Genuine prediction rather than pattern matching. The difference between “customers like Sarah usually do X” and “Sarah will likely do X because of how she perceives value and makes decisions.”
Causal reasoning rather than correlation. World model-powered twins can distinguish between what drives behavior and what merely correlates with it, enabling true experimentation and optimization.

What This Means for Marketing

For marketers, this convergence opens a new frontier. Customer journey optimization today means analyzing past journeys and tweaking touchpoints. With world model-powered digital twins, you could simulate entire journeys before customers take them, testing not just messaging variations but strategic interventions—timing, sequencing, channel choices, offer structures.

The digital twin becomes less like a dashboard and more like a flight simulator—a place to test piloting strategies in realistic conditions before committing resources. You’re not just looking at customer data differently; you’re creating an environment where you can experiment with customer dynamics.

The question is no longer “what did customers do?” but “what would customers do if we changed the system?” That’s the shift from digital twins as mirrors to digital twins as laboratories.

My previous essays on Digital/AI Twins and Marketing:
Marketing’s AI Triad: Large Customer Model, Digital Twins, Co-Marketer
AI-Powered Digital Twins: Marketing’s Marvel
AI Twins: Digital Customer Representations That Will Transform Marketing
Digital Twins in Marketing: Magical Minions
AI Twins: The Future of Marketing Intelligence
AI Twins in Action: Daily Allies for Smarter Marketing and Meaningful Connections

Thinks 1800

WSJ: “When students turn to artificial intelligence to do assignments, it’s usually to do the work easier, faster and, to the chagrin of educators, with less brain power. Not so at Northwestern University, where AI is revolutionizing the cornerstone of American business education—the case study. At Northwestern’s Kellogg School of Management, AI is turning that method on its head. Students no longer read through every available factoid on, say, Walmart’s wages for hourly workers and write a memo—tasks that can be easily circumvented with generative AI tools. Instead, M.B.A.s must draw out details and data through open-ended conversations with AI chatbots and then craft a strategy. Students who’ve worked through the AI-guided case—this one involves helping a school district erase a $50 million deficit fueled by transportation costs—say it more closely resembles what consultants and business leaders do in the real world: You’re not handed information. You must determine what you need to know from a cast of AI-created school officials and employees, and grill it out of them.”

FT: ““It’s still deeply concerning how concentrated this market is becoming,” says Sonnenburg, who focuses on tech stocks for California investment group Irving Investors. “If you’re not in one of about 10 names, it’d be insanely challenging to make money.”…Eight of the 10 biggest stocks in the S&P 500 are tech stocks. Those eight companies account for 36 per cent of the entire US market’s value, 60 per cent of the gains in the index since the market bottomed in April and almost 80 per cent of the S&P 500’s net income growth in the last year.”

Economist: “The extent to which AI truly eliminates rip-off markets depends on two things. First, consumers need to know how to use AI properly. Mindlessly repeating advice from ChatGPT is less effective than using the bot as a learning tool that allows a consumer to negotiate more credibly…Second, providers and retailers are likely to fight back with their own AI tools. Amazon listings are already swamped with AI-generated product descriptions…What seems clear is that the days of the know-nothing consumer are well and truly over.”

Advice from The Generalist: “If you have one truly good idea in your life, that is more than enough…To improve your mind, read fiction. We spend the vast majority of our professional lives in pure knowledge-gathering mode; there are different, deeper truths hidden in stories…“Never quit” is terrible advice. Try lots of things, quit lots of things. Your time is not infinite. Sometimes you have to give up an old dream for a better one to emerge…It’s more impactful to know your most talented contemporaries than your heroes. Meeting a hero is unlikely to change the course of your life. But if you meet the most talented minds of your generation early, you can spend the next fifty years collaborating with them as they scale.”

World Models: An Overview and Marketing’s Future (Part 4)

Comparison

I asked Claude to compare LLMs and World Models.

The AI landscape is witnessing a fascinating debate: are Large Language Models the path forward, or do we need something fundamentally different? To understand this, we need to grasp what each approach actually does—and more importantly, what it cannot do.

What LLMs Actually Do

Large Language Models are trained to predict the next token in sequences of text, learning statistical patterns across billions of words. Think of them as extraordinarily sophisticated pattern-matchers. When you ask ChatGPT a question, it’s not “thinking” in any conventional sense—it’s generating the most probable continuation of your prompt based on patterns it absorbed during training.
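
A toy version of next-token prediction may help here. The sketch below is a bigram counter, not a neural network, and the three-sentence corpus is invented for illustration. It shows the basic idea of picking the most probable continuation based on patterns seen in training text.

```python
from collections import Counter, defaultdict


def train_bigram(corpus):
    """Count how often each word follows each other word."""
    follows = defaultdict(Counter)
    for sentence in corpus:
        tokens = sentence.lower().split()
        for current, nxt in zip(tokens, tokens[1:]):
            follows[current][nxt] += 1
    return follows


def predict_next(follows, word):
    """Return the most frequent continuation seen in training, or None."""
    counts = follows.get(word.lower())
    if not counts:
        return None
    return counts.most_common(1)[0][0]


# Illustrative corpus (an assumption, not real training data).
corpus = [
    "the ball falls to the ground",
    "the ball falls down",
    "the apple falls to the ground",
]
model = train_bigram(corpus)
print(predict_next(model, "falls"))    # most common word after "falls"
print(predict_next(model, "gravity"))  # never seen, so no prediction
```

The model "knows" that things fall to the ground only because the text says so. It has no notion of a ball, a ground, or a force between them.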

This gives LLMs surprising proficiency in diverse tasks like translation, summarization, and dialogue generation. They can write code, explain complex concepts, and even seem to reason. But here’s the catch: their understanding of phenomena like gravity or thermodynamics is inferential, derived solely from linguistic patterns rather than grounded in physical reality.

The hotly contested question is whether LLMs develop genuine “world models” internally. Some research suggests language models can develop rudimentary world models for simple concepts in their internal representations, like tracking game states in Othello. But prominent researchers argue that instead of coherent world models, today’s generative AIs learn “bags of heuristics”—scores of disconnected rules that don’t form a consistent whole.

How World Models Work Differently

World models are neural networks that understand the dynamics of the real world, including physics and spatial properties. Rather than predicting the next word, they predict what will happen next in the physical world, modeling how things move, collide, fall, interact and persist over time.

World models take inspiration from the mental models humans develop naturally: our brains turn sensory input into abstract representations that give us a working understanding of the world around us. A baseball batter has milliseconds to decide how to swing, less time than visual signals take to reach the brain. We can hit a fastball because our internal world model lets us instinctively predict when and where the ball will go.

These models consist of multimodal input processing, temporal prediction that forecasts future frames from historical data, and latent representation learning that operates on abstract state spaces. This allows AI agents to “imagine” different scenarios before acting, planning by simulating future state sequences for different strategies.
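
Planning by imagination can be sketched in a few lines. The example below is an assumption-laden toy: the "world" is a number line, the `model` function is hand-written rather than learned, and the planner simply tries every action sequence in simulation and keeps the one that ends closest to the goal.

```python
from itertools import product


def model(position, action):
    """Toy dynamics: the agent moves along a line by -1, 0 or +1."""
    return position + action


def plan(start, goal, horizon=3, actions=(-1, 0, 1)):
    """Simulate every action sequence of length `horizon` and keep the
    one whose imagined end position is closest to the goal."""
    best_sequence, best_distance = None, float("inf")
    for sequence in product(actions, repeat=horizon):
        position = start
        for action in sequence:
            position = model(position, action)  # imagined, not executed
        distance = abs(goal - position)
        if distance < best_distance:
            best_sequence, best_distance = sequence, distance
    return best_sequence, best_distance


sequence, distance = plan(start=0, goal=2)
print(sequence, distance)
```

Real systems replace the exhaustive loop with smarter search and the hand-written `model` with a learned one, but the pattern is the same: simulate first, act second.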

The Fundamental Difference

The key distinction lies in grounding. LLMs derive their power from statistical patterns in textual data, while world models seek to emulate causality and the laws of physics through learned simulations. An LLM “knows” about gravity because countless texts mention it. A world model understands gravity because it has observed thousands of hours of objects falling and learned the underlying physics.

World models trained on photos, audio, videos, and text create internal representations of how the world works and can reason about the consequences of actions. This matters enormously for applications like robotics, autonomous vehicles, and any AI that must act in physical space rather than just generate text.

What This Means Going Forward

Yann LeCun has argued that within three to five years, world models rather than LLMs will dominate AI architectures. Whether that timeline holds, one thing is clear: text prediction alone has limits. Acting reliably in the physical world requires more than next-token prediction—it requires maintaining state, predicting dynamics, and supporting counterfactual reasoning for planning and control.

We’re likely heading toward hybrid systems that combine both approaches—language understanding from LLMs with physical reasoning from world models. The question for marketers is: which capabilities matter for your applications?

Thinks 1799

FT: ““Like jumping from a bridge.” That was how one employee at French telecoms operator Orange described a skills programme that has helped old-school engineers to master cloud-based technologies, while managers retrain in roles such as cyber security. Thankfully, Orange does not just wait for people to sink or swim; it gives them a safety net. “When we offer a worker the chance to change jobs, it’s not six months in the classroom,” says Vincent Lecerf, executive vice-president of human resources and group transformation at Orange. Instead, the individual joins a new team — on their existing salary and the promise of continued employment — in an approach that, from day one, interleaves on-the-job learning with formal instruction and coaching. “What makes the difference is not just the quality of the training but of the mentor — the feeling that as a learner you’re supported,” Lecerf adds.”

Ads as Cues: “Why do we see both advertising and powerful consumer habits for well-known and intrinsically similar brands? We offer an explanation based on the idea that, as in Bordalo et al. (2020), a consumer is more likely to demand a good if she recalls the pleasure it gave her in the past. In turn, the consumer is more likely to recall goods that are consumed more frequently and more similar to cues, subject to interference from other goods. Our model yields context-dependent brand habits where ads work as memory cues. It predicts that ads: i) are more effective for more habitual consumers and ii) exhibit spillovers, within and across products, that are stronger for more habitual consumers and for goods with more similar ads. Using data from NielsenIQ and Nielsen we find support for these predictions in 20 undifferentiated and highly advertised product categories. Memory offers new insights on how advertising affects market competition and consumer welfare.”

The Deseret 50: The 50 people, organizations and ideas changing philanthropy.

WSJ: “Criminals increasingly use generative AI to mimic real people’s voices and con their loved ones out of money. If you receive a call from someone who sounds just like your grandson and says he needs money or a gift card, the best thing to do is hang up and call your grandson. But if the voice is so convincing that you can’t bear to do that, ask for your family code word. If the caller can’t produce it, hang up. Likewise, if you are legitimately in trouble and need to call a loved one for help, say the code word so your relative knows it’s really you. People who haven’t implemented a family code word have been burned by bad actors.”

World Models: An Overview and Marketing’s Future (Part 3)

Review – 3

ChatGPT:

In artificial intelligence, world models refer to an AI system’s internal representation of how the world works – essentially a simulated reality inside the machine’s mind. A world model enables an AI agent to predict future states of its environment and imagine potential scenarios without directly experiencing them. This concept has roots in cognitive science: humans are thought to carry a mental model of the world, allowing us to anticipate outcomes and plan our actions. As system dynamics pioneer Jay Forrester described, “The image of the world around us, which we carry in our head, is just a model… [one] has only selected concepts, and relationships between them, and uses those to represent the real system.” In other words, we do not mentally simulate every detail of reality, but we hold an abstract model that captures the essential structure of our world. In AI terms, a world model is typically a generative model of the agent’s environment. Formally, researchers define a world model as a system that, given the current state of the world and an action taken by the agent, can predict the next state. For example, if the current state includes the position of objects and the agent’s status, and the action is “move forward,” the world model should predict what the new positions and statuses will be after that action.
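
The formal definition (current state plus action gives predicted next state) can be written as a function signature. The sketch below is a minimal illustration, with assumptions throughout: the `State` fields, the action names, and the battery rule are invented, and the transition is hand-coded where a real world model would learn it from data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class State:
    x: int        # agent position on a grid
    y: int
    battery: int  # agent status


# Hypothetical action set: each action maps to a change in (x, y).
MOVES = {"forward": (0, 1), "back": (0, -1), "left": (-1, 0), "right": (1, 0)}


def predict_next_state(state: State, action: str) -> State:
    """Hand-written stand-in for a learned dynamics model."""
    if action not in MOVES or state.battery == 0:
        return state  # unknown action or empty battery: nothing changes
    dx, dy = MOVES[action]
    return State(state.x + dx, state.y + dy, state.battery - 1)


def imagine(state, actions):
    """Roll the model forward over a sequence of actions without
    touching the real environment."""
    trajectory = [state]
    for action in actions:
        state = predict_next_state(state, action)
        trajectory.append(state)
    return trajectory


start = State(x=0, y=0, battery=2)
for step in imagine(start, ["forward", "forward", "forward"]):
    print(step)
```

In this toy the third "move forward" changes nothing because the imagined battery is already empty, a consequence the agent can foresee without ever running out of power for real.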

A world model is all about learning an internal model of the world’s dynamics so that an AI can understand, predict, and plan within its environment. This involves learning abstract representations (latents), using self-supervised prediction as a training signal, capturing causal relationships over time, and leveraging the model for imagination and planning. The end goal is an AI that doesn’t just react reflexively, but “knows” how the world works and can reason about what to do – much like humans who foresee outcomes using mental models.

Consider this vivid analogy from a recent essay: An LLM is like a brilliant autocomplete that can imitate understanding by stitching together patterns in language, but it has no “imagination” beyond what it’s seen in text. A world model, by design, imagines – it simulates situations that perhaps never occurred exactly the same in its training data. It’s the difference between reading about the world versus experiencing a simulated mini-world. One paper framed it elegantly: “LLM simulates the next word in human language… [A world model] simulates the next world.” [Source]. Each has its strengths – LLMs leverage vast knowledge encoded in human text, whereas world models provide an internal sandbox for truly understanding and navigating an environment. The consensus emerging is that next-generation AI systems likely need both: the knowledge and linguistic capabilities of LLMs plus the grounded predictive planning of world models. Indeed, researchers have begun experimenting with combining the two, for example by training language models with objectives that encourage predicting missing text in a way analogous to JEPA (to imbue them with a bit of world-model-like consistency) [Source].

Claude:

Kenneth Craik proposed in 1943 that minds construct “small-scale models” to anticipate events before they occur. Philip Johnson-Laird’s mental models theory (1983) suggests we reason by mentally simulating scenarios rather than applying formal logic. The predictive processing framework in neuroscience argues the brain constantly predicts sensory input, updating beliefs only when predictions fail.

World Models translate these cognitive principles into AI architecture. They comprise nine interconnected components: latent spaces that compress high-dimensional observations into 32-64 dimensional representations filtering out unpredictable noise; predictive coding that forecasts future latent states from current state plus action; model-based reasoning enabling deliberative Mode-2 thinking versus reactive Mode-1 responses; causal structure encoding cause-effect relationships; simulation capabilities for forward prediction and counterfactual “what if” analysis; memory systems maintaining both short-term hidden states and long-term associative knowledge; planning mechanisms using gradient-based optimization or search-based methods like Monte Carlo Tree Search; self-supervision through contrastive or non-contrastive learning from unlabeled data; and embodied agent capabilities closing the sensorimotor loop from perception through planning to action.

This architecture provides what LLMs fundamentally lack: sample efficiency through imagination rather than trial-and-error (humans learn to drive in 20 hours versus millions of RL episodes for pure model-free learning), common sense grounded in intuitive physics about gravity and momentum, planning and reasoning through forward simulation of consequences, generalization by capturing underlying dynamics that transfer across tasks, safety through virtual testing of dangerous scenarios, and computational efficiency by predicting abstractions not pixels. [Source]

Thinks 1798

Sanjeev Prasad: “People have been giving a higher multiple to India for three reasons: long-term growth prospects, good quality companies and being relatively better versus other countries. But I am seeing a reversal of those arguments now. I met a host of global investors over the past four months, and for the first time, I heard concerns around India’s long-term growth prospects, which nobody ever asked me before. The questions global investors raised were about job growth and (lack of) R&D and innovation by Indian companies. Why are there no world-beating Indian tech companies? Indian companies, to my mind, may have become comfortable, given their belief in the country’s long-term growth prospects. They believe ‘we can continue to make a decent amount of money without taking some additional risk,’ if I may put it that way. However, making money easily is becoming more difficult for the simple reason, if you look at any sector in India, you’re starting to see more competition and disruption coming in. So this fundamental premise, which both companies and investors are working with that we have growth in the longer run and (there is) no change in the competitive dynamics, business model, etc., is facing a challenge because of technological changes.”

Fernando Nikolić: “Everyone under 30 is prematurely old (worried about savings, career, FIRE). Everyone over 50 is desperately young (Burning Man, psychedelics). My theory: Information abundance aged the young by showing them all future problems all at once. Information abundance also made the old young by showing them all missed experiences all at once. So now Gen Z talks like retirement planners and boomers act like teenagers. It’s so over.”

WSJ: “Here in the future, we call them ultralight eVTOLs (electric vertical takeoff and landing vehicles). Of course, they don’t much resemble the levitating Studebakers and auto-gyrating Chevys foretold in pulp science fiction. The Pivotal BlackFly—the first series-produced ultralight eVTOL to reach the consumer market (2023)—doesn’t even have wheels. It takes off and lands on its curved keel. It’s also amphibious, behavior highly atypical in a car.”

India Dispatch: “The result is a market that no longer resembles the concentrated duopoly that existed for over a decade until 2023. Amazon and Flipkart now find themselves squeezed between two models they cannot easily replicate because their expensive infrastructure, designed for catalogue breadth and next-day delivery, proves ill-suited for either 10-minute deliveries in dense urban areas or ultra-low-price commerce in India’s interior.”

World Models: An Overview and Marketing’s Future (Part 2)

Review – 2

WSJ: “Today’s AIs are book smart. Everything they know they learned from available language, images and videos. To evolve further, they have to get street smart. That requires “world models.” The key is enabling AI to learn from their environments and faithfully represent an abstract version of them in their “heads,” the way humans and animals do. To do it, developers need to train AIs by using simulations of the world… There’s an almost unanimous belief among AI pioneers that world models are crucial to creating next-generation AI. And many say they will be critical to someday creating better-than-human “artificial general intelligence,” or AGI… While the type of AI that makes large language models and ChatGPT possible gets all of the attention right now, it’s world-model-based AI that is gaining momentum in frontier research and could allow technology to take on new roles in our lives… LLMs appear to be taking over some functions in white-collar jobs already, and AIs that are world-model smart could allow them to take over yet more. Blue-collar work has been relatively safe. But as AI developers go ever deeper on world models, robots could start applying for jobs as truck drivers, plumbers or caregivers.”

WSJ: “Yann LeCun has been telling anyone who asks that he thinks large language models, or LLMs, are a dead end in the pursuit of computers that can truly outthink humans… “We are not going to get to human-level AI just by scaling LLMs,” he said.”

Fei-Fei Li: “Today, leading AI technology such as large language models (LLMs) have begun to transform how we access and work with abstract knowledge. Yet they remain wordsmiths in the dark; eloquent but inexperienced, knowledgeable but ungrounded. Spatial intelligence will transform how we create and interact with real and virtual worlds—revolutionizing storytelling, creativity, robotics, scientific discovery, and beyond. This is AI’s next frontier…Building spatially intelligent AI requires something even more ambitious than LLMs: world models, a new type of generative models whose capabilities of understanding, reasoning, generation and interaction with the semantically, physically, geometrically and dynamically complex worlds – virtual or real – are far beyond the reach of today’s LLMs… For spatial intelligence, I define world models through three essential capabilities: 1. Generative: World models can generate worlds with perceptual, geometrical, and physical consistency…2. Multimodal: World models are multimodal by design…3. Interactive: World models can output the next states based on input actions.”

Daniel Ince-Cushman: “In the evolving landscape of artificial intelligence, two prominent paradigms have emerged although World Models remain lesser known. One is the Large Language Model (LLM), a linguistic construct trained to predict and generate human-like text. The other is the World Model, a computational framework focused on simulating environments and enabling decision-making through internalized representations of external dynamics…World models…are designed to learn representations of environments, often through multi-modal sensory data or interactive simulation. These models typically consist of perception modules (e.g., vision encoders), memory or state encoders, and predictive components that simulate future states given current inputs and actions. These systems embody fundamentally distinct conceptions of intelligence. The LLM derives its power from statistical patterns in textual data, while the world model seeks to emulate causality and the laws of physics through learned simulations. The utility of a world model lies in its ability to enable planning. Through internal simulation, an agent can forecast outcomes and select optimal actions without direct environmental interaction.”
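To make the "planning through internal simulation" idea concrete, here is a minimal Python sketch. It is a toy illustration under stated assumptions, not how any production world model works: the one-dimensional world, the three-action set, the cost function, and the names `predict_next`, `imagined_cost`, and `plan_first_action` are all hypothetical, and the hand-written physics in `predict_next` stands in for what would normally be a learned predictive component.

```python
from itertools import product

# Hypothetical action set: push left, coast, push right.
ACTIONS = (-1.0, 0.0, 1.0)


def predict_next(state, action, dt=0.1):
    """Toy stand-in for a learned dynamics model.

    state is (position, velocity); the action is an acceleration.
    A real world model would learn this mapping from data.
    """
    pos, vel = state
    vel = vel + action * dt
    pos = pos + vel * dt
    return (pos, vel)


def imagined_cost(state, plan, goal):
    """Roll a candidate plan forward inside the model only, then score the end state."""
    for action in plan:
        state = predict_next(state, action)
    pos, vel = state
    # Assumed cost: distance to the goal plus a small penalty on leftover speed.
    return abs(goal - pos) + 0.1 * abs(vel)


def plan_first_action(state, goal, horizon=5):
    """Search every short action sequence in imagination.

    Returns the first action of the lowest-cost sequence. No step is
    taken in the real environment while planning.
    """
    best_plan = min(
        product(ACTIONS, repeat=horizon),
        key=lambda plan: imagined_cost(state, plan, goal),
    )
    return best_plan[0]


if __name__ == "__main__":
    start = (0.0, 0.0)  # at rest at the origin
    print(plan_first_action(start, goal=1.0))   # goal to the right
    print(plan_first_action(start, goal=-1.0))  # goal to the left
```

The agent compares 243 imagined futures (three actions over five steps) before committing to a single real move, which is the sense in which a world model lets it "forecast outcomes and select optimal actions without direct environmental interaction." Exhaustive search is used here only because the toy is tiny; larger problems would need sampling or a learned policy.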

Gilbert Traverse: “The most valuable enterprise AI investments going forward will likely focus on integrating these two paradigms: leveraging LLMs for human-centric interfaces and general knowledge, while deploying increasingly sophisticated World Models to drive the complex planning, optimization, and action needed for real-world automation in areas like manufacturing, logistics, and autonomous operations. Strategic foresight demands an understanding of both, allowing organizations to capitalize on the linguistic power of today’s AI while building the causal foundation for tomorrow’s intelligent agents.”