- DeepSeek R1: China’s $6M Wake-Up Call to the West
In January 2025, Chinese company DeepSeek released its R1 model under the MIT License, achieving performance comparable to OpenAI's o1 on benchmarks like AIME and MATH. The company claims it trained its V3 model for US$6 million, far less than the reported US$100 million cost of OpenAI's GPT-4 in 2023, and with approximately one-tenth the computing power consumed by Meta's comparable model.
The release rattled markets, sending NVIDIA's stock down 17 percent amid fears that its high-performance GPUs might not be as essential as assumed. If the United States does not double down on AI infrastructure, incentivize an open-source environment, and overhaul its export control measures, the next Chinese breakthrough may actually become a Sputnik-level event.
Why it matters: DeepSeek proved that frontier AI capabilities don’t require frontier budgets. It validated reinforcement learning-first approaches and demonstrated that U.S. export controls on chips couldn’t kill innovation—only redirect it.
- Reasoning Models Came of Age
Reasoning defined the year, as frontier labs combined reinforcement learning, rubric-based rewards, and verifiable reasoning with novel environments to create models that can plan, reflect, self-correct, and work over increasingly long time horizons.
The duration of tasks that large language models can reliably complete with at least a 50 percent success rate has been doubling every seven months since 2019, according to METR. In 2019, leading models could manage only tasks requiring a few seconds of human effort. By 2025, METR measured Anthropic's Claude 3.7 Sonnet at a "time horizon" of 59 minutes, and Anthropic reports that Claude Sonnet 4.5 can sustain work on complex tasks for more than 30 hours.
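Expressed as a formula, the trend implies exponential growth in task length: horizon after t months ≈ h₀ · 2^(t/7). A minimal sketch of that extrapolation (the starting value and elapsed time below are illustrative, not METR's data points):

```python
def projected_horizon(h0_minutes: float, months_elapsed: float,
                      doubling_months: float = 7.0) -> float:
    """Extrapolate a task time horizon that doubles every `doubling_months`."""
    return h0_minutes * 2 ** (months_elapsed / doubling_months)

# From a 59-minute horizon, 14 months (two doubling periods) later:
print(projected_horizon(59, 14))  # -> 236.0 minutes
```

The compounding is the point: at a seven-month doubling period, a one-hour horizon becomes a full working day in under two and a half years.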
Why it matters: The shift from “next token prediction” to genuine step-by-step reasoning transformed what AI could accomplish—from answering questions to completing complex, multi-hour professional tasks.
- GPT-5 and the Frontier Model Race Intensified
OpenAI launched GPT-5 on August 7, 2025, combining reasoning and non-reasoning functionality under a common interface with state-of-the-art performance on various benchmarks. GPT-5 achieved 94.6% on AIME 2025 without tools, 74.9% on SWE-bench Verified, and is approximately 45% less likely to hallucinate than GPT-4o.
Google released Gemini 3 in November, calling it “AI for a new era of intelligence” and the best model in the world for multimodal understanding—its most powerful agentic and vibe coding model to date.
OpenAI declared a “code red” to accelerate GPT-5.2’s release by December 9, 2025, countering Google’s Gemini 3, which topped leaderboards and wowed Sam Altman himself.
Why it matters: The major labs are now releasing frontier models every few months, not years. Competition is driving rapid capability gains—but also raising questions about safety and sustainability.
- AI Became a Scientific Collaborator
AI is becoming a scientific collaborator, with systems like DeepMind’s Co-Scientist and Stanford’s Virtual Lab autonomously generating, testing, and validating hypotheses. In biology, Profluent’s ProGen3 showed that scaling laws now apply to proteins too.
In 2026, AI won’t just summarize papers, answer questions and write reports—it will actively join the process of discovery in physics, chemistry and biology. “AI will generate hypotheses, use tools and apps that control scientific experiments, and collaborate with both human and AI research colleagues,” says Peter Lee, president of Microsoft Research.
Google DeepMind marked the five-year anniversary of AlphaFold's breakthrough on the protein-folding problem. The profound scientific and societal value of this work was recognized with the 2024 Nobel Prize in Chemistry.
Why it matters: AI is moving from being a tool scientists use to being a collaborator that proposes hypotheses and designs experiments—potentially accelerating discovery by orders of magnitude.
- Agentic AI Moved from Concept to Production
By late 2025, 52% of executives reported that their organizations had already deployed AI agents, with 88% achieving measurable ROI. Agentic commerce, shopping powered by autonomous AI agents, is transforming from experimental technology into competitive necessity.
Traffic to US retail sites from GenAI browsers and chat services increased 4,700% year-over-year in July 2025, according to Adobe. Customers arriving via AI agents are 10% more engaged than traditional visitors, reaching retailers further down the sales funnel with a stronger intent to purchase.
OpenAI announced an Agentic Commerce Protocol, codeveloped with Stripe, which allows users to complete purchases within ChatGPT without leaving the chat. Shopify is developing an agentic shopping infrastructure that allows agents to tap into its catalog and build carts across merchants. Amazon, Google, PayPal, Mastercard, and others are also developing agentic shopping services.
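As a rough illustration of what an agent-initiated checkout involves, here is a hypothetical payload. The Agentic Commerce Protocol's actual schema is defined by OpenAI and Stripe; every field name below is an assumption for illustration only:

```python
import json

# Hypothetical payload, NOT the real Agentic Commerce Protocol schema:
# field names and structure here are assumptions for illustration.
checkout_request = {
    "merchant": "example-store",
    "items": [{"sku": "SKU-123", "quantity": 1}],
    "payment": {"method": "delegated_token", "provider": "stripe"},
    "buyer_confirmation_required": True,  # user approves before purchase
}

print(json.dumps(checkout_request, indent=2))
```

The design question any such protocol must answer is delegation: the agent assembles the order, but per OpenAI's announcement the purchase completes inside ChatGPT with the payment provider handling credentials, rather than the agent holding them.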
Why it matters: AI agents aren’t just answering questions—they’re booking travel, comparing prices, negotiating deals, and completing purchases. The intermediary layer between brands and consumers is being rewritten.
- Enterprise AI Adoption Crossed the Chasm
Commercial traction accelerated sharply. Forty-four percent of U.S. businesses now pay for AI tools (up from 5% in 2023), average contracts reached $530,000, and AI-first startups grew 1.5× faster than peers, according to Ramp and Standard Metrics.
Our inaugural AI Practitioner Survey, with over 1,200 respondents, shows that 95% of professionals now use AI at work or home, 76% pay for AI tools out of pocket, and most report sustained productivity gains—evidence that real adoption has gone mainstream.
Software development activity on GitHub reached new levels in 2025. Each month, developers merged 43 million pull requests—a 23% increase from the prior year. The annual number of commits jumped 25% year-over-year to 1 billion.
Why it matters: AI moved from pilot projects to production deployments. The question shifted from “should we use AI?” to “how fast can we scale it?”
- Small Language Models Proved Bigger Isn’t Always Better
2025 saw a pivot toward efficient small language models (SLMs). Cost pressure, latency requirements, and privacy demands forced enterprises to rethink the assumption that bigger is better. Models with 3B to 15B parameters became workhorses of the year, often costing under $0.0001 per request to run.
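To see why the per-request economics matter, a back-of-envelope comparison helps. Only the ~$0.0001 SLM figure comes from the text above; the frontier-model price is a hypothetical assumption:

```python
SLM_COST_PER_REQUEST = 0.0001      # small model, per the figure above
FRONTIER_COST_PER_REQUEST = 0.01   # hypothetical frontier-model price

requests_per_month = 1_000_000
slm_monthly = SLM_COST_PER_REQUEST * requests_per_month
frontier_monthly = FRONTIER_COST_PER_REQUEST * requests_per_month

print(f"SLM: ${slm_monthly:,.0f}/mo  frontier: ${frontier_monthly:,.0f}/mo")
# At 1M requests/month: $100 vs $10,000 under these assumptions
```

At high request volumes, a two-orders-of-magnitude price gap dominates any modest accuracy difference, which is exactly the pressure that pushed enterprises toward smaller models.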
Fine-tuned small language models are built for specific purposes and trained on focused data, providing high accuracy for their specialized tasks. They're breaking the old adage: "Between good, cheap and fast, choose two." These SLMs can deliver all three, often performing comparably to larger models on accuracy while outperforming them on speed and cost.
Domain-specific language models (DSLMs) fill the gap where generic LLMs fall short for specialized tasks. By 2028, Gartner predicts that over half of the GenAI models used by enterprises will be domain-specific.
Why it matters: The “race to the biggest model” gave way to a focus on efficiency, specialization, and practical deployment. Enterprises realized they needed models optimized for their specific use cases, not general-purpose giants.
- AI Search Began Eating Traditional Search
ChatGPT Search now handles over a billion searches weekly. It surfaces source links by design and is adding product-style results. The discovery pie is growing, but it's being sliced differently.
ChatGPT now owns 4.3% of search share—a jaw-dropping rise for a platform that wasn’t even on the radar a few years ago, and it’s racked up over 400 million weekly users.
AI platforms are expected to drive more website visits than traditional search engines within the next three years. Traffic from large language models rose from about 17,000 sessions in January–May 2024 to 107,000 in the same period of 2025, a roughly sixfold increase.
Why it matters: Brand discovery is being rewired. When consumers ask ChatGPT “what’s the best X?” instead of Googling, decades of SEO investment become less relevant. Brands must now optimize for AI answers, not just search rankings.
- Generative AI Transformed Marketing (from “Walk” to “Run”)
72% of marketers identified GenAI as the most important consumer trend heading into H2 2025—a 15-point increase from late 2024. GenAI ranked first across every vertical surveyed, from telecom and retail to travel and financial services.
90% of marketers now use AI for text-based tasks, with the most common applications being idea generation (90%), draft creation (89%), and headline writing (86%). ChatGPT dominates at 90% usage, followed by Google Gemini at 51% and Claude at 33%.
GenAI is now being used for copywriting by 34% of marketers, image generation by 25%, and creative versioning by 25%. Website development usage jumped nearly 70% since last year.
Why it matters: Marketing teams moved from experimenting with AI to deploying it across content creation, personalization, and campaign optimization at scale. The productivity multiplier became real.
- The Martech Stack Exploded (and Imploded Simultaneously)
The martech landscape grew 9% to 15,384 solutions, with AI-native products continuing to blossom while the previous generation consolidated.
62.1% of respondents now use more tools than two years ago, with generative AI tools used by 68.6% of organizations—making them the 6th most popular martech tool category in just two years.
Popular AI assistants like ChatGPT, Claude, and Gemini are creating software programs behind the scenes to do users' bidding, often without the user ever realizing software was built. The result: not millions but billions of custom software programs proliferating across digital operations.
Why it matters: The “hypertail” of custom AI-powered apps emerged alongside commercial tools. Marketers can now build their own solutions without engineering teams, fundamentally changing the build-vs-buy equation.