Deep Dive

The Architecture of Autonomy: Walmart's Sparky and the Emergence of Agentic Commerce

How the World's Largest Retailer Is Rebuilding Commerce Around AI Super-Agents

Know Your Agent (KYA) | February 20, 2026 | 24 min read

What Sparky Is

Sparky is Walmart's AI shopping agent. It lives in the Walmart app—the smiley-faced icon that pops up when you open the homepage. The name is an internal rebrand of what used to be a handful of disconnected AI features: search suggestions, review summaries, product comparisons. Now they're unified under one agent.

The pitch: instead of typing "paper towels" into a search bar and scrolling through 400 results, you tell Sparky "I'm hosting a backyard barbecue for 20 people this Saturday" and it builds you a shopping list. Charcoal, buns, plates, napkins, a cooler if you don't have one. It knows what's in stock at your local store and what can be delivered by Friday.

During Q4 FY2026, Walmart reported that customers using Sparky had an average order value 35% higher than non-Sparky users. The company's stated goal: e-commerce hitting 50% of total revenue within five years.

Sparky is the part you see. Underneath it sits a proprietary ML platform called Element, a family of retail-trained LLMs called Wallaby, a rebuilt search architecture, and a protocol strategy that deliberately opens Walmart's inventory to external AI agents like Gemini and ChatGPT. This piece covers how all of that fits together.

1. Where Sparky Sits

Four Levels of Agent Autonomy

Walmart talks about "agentic commerce" as a spectrum. Most retailers are at Level 1—basically a chatbot that answers questions. Walmart is pushing toward Level 4, where the agent handles the whole transaction.

Level	What It Does	Walmart Implementation
L1: Research	Answers questions, compares products.	Review summaries, feature comparisons.
L2: Intent-Driven	Suggests things based on what you're trying to do.	Meal planning, project-based grocery lists.
L3: Negotiation	Compares prices, delivery speeds, applies discounts.	Google UCP integration for cross-platform inventory.
L4: Execution	Handles checkout and fulfillment on its own.	Instant Checkout via ChatGPT and Gemini.

What's interesting is the architecture. Walmart isn't bolting AI onto existing features one at a time. They're building an orchestration layer where Sparky delegates to specialized sub-agents—one handles inventory, another handles pricing, another handles logistics. You see one icon. Behind it, a dozen things are coordinating.
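
The delegation pattern described above can be sketched in a few lines. This is a minimal illustration, not Walmart's implementation: the `Orchestrator` class, the three sub-agent functions, and the toy stock and price data are all hypothetical names invented for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class AgentReply:
    agent: str      # which sub-agent answered
    payload: dict   # that sub-agent's result


# Hypothetical sub-agents. Each takes the same request dict and answers
# only for its own domain. All data below is made up for illustration.
def inventory_agent(request: dict) -> AgentReply:
    stock = {"charcoal": 12, "buns": 0, "napkins": 40}
    in_stock = {item: stock.get(item, 0) > 0 for item in request["items"]}
    return AgentReply("inventory", {"in_stock": in_stock})


def pricing_agent(request: dict) -> AgentReply:
    prices = {"charcoal": 9.97, "buns": 2.48, "napkins": 3.12}
    total = sum(prices.get(item, 0.0) for item in request["items"])
    return AgentReply("pricing", {"total": round(total, 2)})


def logistics_agent(request: dict) -> AgentReply:
    return AgentReply("logistics", {"deliver_by": request["needed_by"]})


class Orchestrator:
    """Single entry point that fans a request out to registered sub-agents."""

    def __init__(self) -> None:
        self._agents: Dict[str, Callable[[dict], AgentReply]] = {}

    def register(self, name: str, agent: Callable[[dict], AgentReply]) -> None:
        self._agents[name] = agent

    def handle(self, request: dict) -> Dict[str, dict]:
        # Call every sub-agent and merge the replies into one response.
        replies = [agent(request) for agent in self._agents.values()]
        return {reply.agent: reply.payload for reply in replies}
```

The caller only ever talks to `Orchestrator.handle`, which mirrors the "one icon, many agents" idea: adding a new capability means registering another function, not changing the interface.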

2. The Stack

What's Actually Running Under the Hood

Sparky isn't a GPT-4 wrapper. Walmart built its own stack, and the choices tell you a lot about their cost model and what they think matters for retail AI.

Element Platform

Walmart's proprietary ML platform. It's built for retail-specific workflows: real-time inventory checks, pricing decisions, fulfillment routing. The key detail is that it's stateful—Sparky keeps memory across sessions. It remembers your dialogue history, your preferences, what you bought last time. Most chatbots forget you existed 30 seconds ago. Sparky doesn't.
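
To make "stateful" concrete, here is a minimal sketch of memory that outlives a single session. It assumes an in-memory dictionary standing in for whatever persistent store Element actually uses; `CustomerMemory` and `MemoryStore` are hypothetical names, not Element APIs.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import DefaultDict, Dict, List


@dataclass
class CustomerMemory:
    preferences: Dict[str, str] = field(default_factory=dict)
    past_purchases: List[str] = field(default_factory=list)
    dialogue: List[str] = field(default_factory=list)


class MemoryStore:
    """Keeps one memory record per customer across sessions (in-memory stand-in)."""

    def __init__(self) -> None:
        self._by_customer: DefaultDict[str, CustomerMemory] = defaultdict(CustomerMemory)

    def start_session(self, customer_id: str) -> CustomerMemory:
        # Returns the same record every time, so a new session
        # sees whatever earlier sessions wrote.
        return self._by_customer[customer_id]
```

A stateless chatbot would build a fresh `CustomerMemory` on every call; the only difference here is that the record is looked up by customer rather than created per conversation.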

Wallaby LLMs

Walmart's in-house language models, trained on decades of transaction data, supply chain logs, and customer interactions. The advantage over GPT-4 or Claude is specificity: Wallaby knows the Walmart catalog cold. It knows the difference between Great Value and the name-brand equivalent. It knows which one your zip code tends to buy. A general-purpose model trained on the internet can't do that.

It's the same structural advantage Amazon has with custom silicon: own the infrastructure, change the unit economics. Competitors building on third-party APIs pay per token. Walmart can amortize the fixed cost of training and serving its own models across 270 million weekly customers. That's hard to replicate.

3. How Search Actually Works

The Hybrid Retrieval System

Sparky only works if it can find the right products. Walmart rebuilt its search from scratch to handle this. The problem with keyword search is obvious: you type "something to keep my drinks cold at the beach" and a keyword engine returns nothing useful. Sparky returns coolers, insulated tumblers, and ice packs.

They built a hybrid system that combines traditional text matching with neural embeddings.

Component	How It Works	What It's Good At
Inverted Index	Traditional BM25 text matching.	Precise lookups: product IDs, model numbers, rare tokens.
ANN Fetcher	Approximate Nearest Neighbor search on neural embeddings.	Tail queries, synonyms, natural language intent.
Two-Tower Model	Separate encoding for queries and product attributes.	Item embeddings can be computed ahead of time, so retrieval scales to millions of items.
GBDT Re-ranker	Gradient Boosted Decision Trees for final sorting.	Mixes in user behavior and business rules for final ranking.
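
The pipeline in the table can be illustrated with a toy version. This is a sketch under heavy simplifications, not Walmart's system: the catalog and the two-dimensional "embeddings" are made up, a brute-force cosine scan stands in for the ANN fetcher, and a fixed weighted blend stands in for the GBDT re-ranker. All function names are hypothetical.

```python
import math
from collections import Counter

# Toy catalog (made-up items).
CATALOG = {
    "cooler-48qt": "hard cooler 48 quart ice chest",
    "tumbler-20oz": "insulated tumbler 20 oz stainless steel",
    "towel-beach": "striped beach towel cotton",
}

# Hand-written 2-d vectors (axis 0: keeps things cold, axis 1: beach gear).
# A real two-tower model would learn these; the numbers are illustrative.
ITEM_VECS = {
    "cooler-48qt": (0.9, 0.4),
    "tumbler-20oz": (0.8, 0.2),
    "towel-beach": (0.0, 1.0),
}


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """BM25 with the non-negative idf variant log(1 + (N - df + 0.5) / (df + 0.5))."""
    tokenized = {doc_id: text.split() for doc_id, text in docs.items()}
    n = len(tokenized)
    avg_len = sum(len(t) for t in tokenized.values()) / n
    scores = {}
    for doc_id, tokens in tokenized.items():
        tf = Counter(tokens)
        score = 0.0
        for term in query.split():
            df = sum(1 for t in tokenized.values() if term in t)
            if df == 0:
                continue  # term appears nowhere in the catalog
            idf = math.log((n - df + 0.5) / (df + 0.5) + 1.0)
            f = tf[term]
            score += idf * f * (k1 + 1) / (f + k1 * (1 - b + b * len(tokens) / avg_len))
        scores[doc_id] = score
    return scores


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def hybrid_search(query, query_vec, lexical_weight=0.5):
    """Blend normalized BM25 with embedding similarity and sort best-first."""
    lexical = bm25_scores(query, CATALOG)
    top = max(lexical.values()) or 1.0  # avoid dividing by zero when nothing matches
    blended = {
        doc_id: lexical_weight * (lexical[doc_id] / top)
        + (1 - lexical_weight) * cosine(query_vec, ITEM_VECS[doc_id])
        for doc_id in CATALOG
    }
    return sorted(blended, key=blended.get, reverse=True)
```

A query like "tumbler" is resolved by the lexical side alone, while "keep drinks cold" shares no words with any listing and is ranked entirely by the embedding side. That division of labor is the reason for running both retrievers.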

Results: 85–89% agreement with human evaluators on relevance. They trained it on over 6 million query-item pairs and use teacher-student distillation to compress the models down to versions fast enough for Walmart.com traffic.

It understands context, not just characters. Search for "apple" in grocery and you get fruit. Search for "apple" in electronics and you get the brand. If the agent can't understand what you mean, it can't buy the right thing.

4. The Four Super-Agents

It's Not Just Sparky

Sparky gets the press. But Walmart actually deployed four super-agents, and the interesting part is how they share data with each other.

Sparky — The Consumer Agent

The customer-facing one. Today it does review synthesis, item comparison, and handles queries like "What do I need for a beach trip?" The roadmap includes processing images and video—take a photo of your pantry and it tells you what you're low on. The bigger play is predictive: Sparky analyzes your seasonal patterns and purchase history to propose a full grocery list for your diet before you ask. You open the app and the list is already there.

Marty — The Partner and Advertiser Agent

The supplier-facing one, built into Walmart Connect. Suppliers use it for onboarding, order management, and ad bidding—all conversational. The interesting feature: it links real-time sales signals directly to ad performance. If a product starts trending, Marty can automatically shift budget to capitalize. Basically an AI media buyer for the Walmart ad platform.
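
One way to picture "shift budget to what's trending" is a simple reallocation rule. The sketch below is an assumption about how such a rule could look, not how Marty works: `reallocate_budget`, the `sales_lift` signal (current sales divided by a baseline), and the 20% shift fraction are all invented for illustration.

```python
def reallocate_budget(budgets, sales_lift, shift_fraction=0.2):
    """Move a fraction of spend from flat campaigns to trending ones.

    budgets:     campaign -> current budget
    sales_lift:  campaign -> current sales / baseline sales (hypothetical signal)
    Total budget is preserved.
    """
    trending = [c for c in budgets if sales_lift.get(c, 1.0) > 1.0]
    others = [c for c in budgets if c not in trending]
    if not trending or not others:
        return dict(budgets)  # nothing to shift from or to

    # Take the same fraction from every non-trending campaign.
    pool = sum(budgets[c] * shift_fraction for c in others)
    new_budgets = {c: budgets[c] * (1 - shift_fraction) for c in others}

    # Hand the pool to trending campaigns in proportion to their lift.
    total_lift = sum(sales_lift[c] for c in trending)
    for c in trending:
        new_budgets[c] = budgets[c] + pool * sales_lift[c] / total_lift
    return new_budgets
```

With two campaigns at 100 each and only one of them trending, the rule moves 20 from the flat campaign to the trending one and leaves the total unchanged.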

The Associate Agent

For store employees. Pulls together sales data, shift planning, and inventory into one interface. Scheduling a week of shifts used to take 90 minutes. Now it takes 30. Not glamorous, but an hour saved per weekly schedule, multiplied across 4,700 stores, is on the order of 4,700 hours a week if each store builds one schedule.

WIBEY — The Developer Agent

Internal tooling for Walmart engineers. Scaffold a service, fix a pipeline, resolve a compliance flag—all through natural language. Walmart claims it lets their engineering teams ship 2–3x faster. It's agents building agents, basically.

5. The Protocol Bet

Why Walmart Chose Open Over Closed

This is where it gets strategically interesting. Amazon built Rufus as a closed system—they control the catalog, payments, identity, fulfillment, and the agent. All one stack. Walmart went the other direction.

At NRF 2026, Walmart announced it's adopting the Universal Commerce Protocol (UCP), built with Google. Here's how the competing protocols break down.

Protocol	Who	Philosophy
Universal Commerce Protocol (UCP)	Google & Walmart	Open standard. Interoperability and price competition.
Agentic Commerce Protocol (ACP)	OpenAI & Stripe	Ecosystem-integrated transactions within ChatGPT.
Trusted Agent Protocol	Visa	Secure, no-code frameworks for merchant acceptance.
Agent Pay Framework	Mastercard	Token-based security and cryptographic agent IDs.

Under UCP, an AI agent on Gemini or ChatGPT can find Walmart inventory, compare options, and check out—without the user ever leaving the conversation. The critical detail: Walmart stays the "merchant of record." They keep the customer relationship, the loyalty data, and the fulfillment. The AI platform gets the interface. Walmart gets the transaction.

Walmart is deliberately letting external AI platforms query its catalog and initiate checkout. The bet: give up the interface to become the default fulfillment backend for every agent that matters.

6. The Partnerships

What "Open" Looks Like in Practice

So what does this actually look like? Two deals show the model.

Google Gemini

You can search, compare, and buy Walmart products inside a Gemini conversation. Link your account and it pulls your purchase history—online and in-store—for personalized picks. Walmart+ benefits apply automatically. You never leave Gemini, but everything ships from Walmart.

OpenAI / ChatGPT

"Instant Checkout" inside ChatGPT. Tell it you need stuff for taco night, it picks the tortillas, ground beef, cheese, and salsa from Walmart's catalog, charges your card, and schedules delivery. No app switching. No cart. You just have a conversation and groceries show up.

Walmart calls this "AgenTek." Let the AI platforms own the conversation. Make Walmart the pipes they all run on. It's the opposite of Rufus, where Amazon owns both the interface and the backend.

7. The Numbers

What's Actually Moving

Figure	What It Measures
35%	Higher AOV for Sparky users vs. non-Sparky
30 min	Shift planning time (down from 90 min)
18 weeks	Faster fashion production timeline
40%	Improvement in customer care resolution

The 35% AOV lift is the big one. The mechanism is simple: Sparky turns single-item purchases into baskets. You ask for help planning a kid's birthday party and it adds decorations, plates, a cake mix, candles, gift bags, and a piñata. You came for one thing. You left with twelve.

The Supply Chain Side

This isn't just a front-end play. Walmart has nearly 90 million IoT sensors and digital twins of its facilities running real-time monitoring. By December 2025, about 65% of its 4,700 U.S. stores were served by automated distribution, and 55% of fulfillment center volume ran through Symbotic's robotics.

Figure	What It Measures
<3 hrs	Delivery for 35% of orders
65%	Stores with automated distribution
20%	Reduction in unit delivery costs

The 20% delivery cost reduction is what funds everything else. That margin goes straight back into "Everyday Low Price" positioning. The AI isn't a cost center—it's what makes the core business model work at scale.

8. Security

Who Gets Verified When Agents Spend Money

When an AI agent charges someone's credit card, who's responsible? That question gets complicated fast. Walmart, working with Mastercard, has rolled out the Agent Pay framework to handle it.

Mechanism	What It Does
Agentic Tokens	Dynamic credentials tied to a specific agent, merchant, and transaction type. A grocery token can't be reused at a different merchant. Steal one and it's worthless.
Biometric Authorization	Human approval via biometric link for big or unusual purchases. The agent can't spend without you signing off.
OAuth 2.0 Linking	Lets the user grant the agent scoped access to their account in the merchant's system. Keeps your loyalty data and preferences intact across sessions.
Verifiable Credentials	Cryptographic proof that a real user authorized the transaction, not a spoofed agent.
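
The scoping idea behind the first two rows can be sketched with a signed token. This is an illustration of the concept only, not Mastercard's Agent Pay format: the claim names, the shared `SECRET`, the HMAC signature, and the step-up threshold are all assumptions made for the example.

```python
import hashlib
import hmac
import json
import time

# Hypothetical issuer key for the demo. A real system would not hard-code
# secrets and would likely use asymmetric signatures.
SECRET = b"demo-issuer-key"


def _sign(claims):
    body = json.dumps(claims, sort_keys=True).encode()
    return hmac.new(SECRET, body, hashlib.sha256).hexdigest()


def issue_token(agent_id, merchant_id, txn_type, max_cents, ttl_s=300, now=None):
    """Issue a token bound to one agent, one merchant, and one transaction type."""
    issued_at = time.time() if now is None else now
    claims = {
        "agent": agent_id,
        "merchant": merchant_id,
        "type": txn_type,
        "max_cents": max_cents,
        "exp": issued_at + ttl_s,
    }
    return {"claims": claims, "sig": _sign(claims)}


def authorize(token, merchant_id, txn_type, amount_cents, now=None, step_up_cents=20000):
    """Check signature, expiry, scope, and amount; large amounts need a human."""
    if not hmac.compare_digest(_sign(token["claims"]), token["sig"]):
        return "reject: bad signature"
    claims = token["claims"]
    current = time.time() if now is None else now
    if current > claims["exp"]:
        return "reject: expired"
    if claims["merchant"] != merchant_id or claims["type"] != txn_type:
        return "reject: wrong scope"  # a stolen token is useless elsewhere
    if amount_cents > claims["max_cents"]:
        return "reject: over limit"
    if amount_cents >= step_up_cents:
        return "step-up: human approval required"  # e.g. a biometric prompt
    return "approve"
```

Because the merchant and transaction type are inside the signed claims, presenting the token anywhere else fails the scope check, and editing the claims fails the signature check.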

Walmart also has a "Responsible AI Pledge"—transparency, security, privacy, fairness, accountability, customer-centricity. Standard governance language, but they do run every new product through an internal AI review before it ships.

Here's the thing: inside Walmart's own app, they control the trust layer. But the open strategy means agents from Gemini, ChatGPT, and potentially anyone else can initiate transactions too. How do you verify that an agent showing up at checkout is actually authorized? Walmart's system handles it internally. The open web doesn't have an answer yet.

9. The Problems

What Could Go Wrong

The open strategy has an obvious downside.

You Lose the Customer Relationship

If people research, discover, and check out on Gemini or ChatGPT, they stop thinking of themselves as "Walmart shoppers." They're "Gemini shoppers" who happen to get packages from Walmart. A 2026 Deloitte survey found that 81% of retail executives think generative AI will weaken brand loyalty by 2027. Walmart is betting that fulfillment matters more than the interface. They might be right. But it's a real gamble.

Your Next Customer Is Software

Walmart's counter-argument: the "new customer" is the AI agent itself. Agents don't care about brand storytelling or emotional marketing. They care about structured data, reliable fulfillment, and competitive pricing. So Walmart is optimizing for agents as the buyer, not humans. Make the data clean, the APIs fast, and the fulfillment reliable. The brand relationship with the human? That's Gemini's problem now.

10. What Brands Should Do

Optimizing for Agents, Not Humans

If you're a brand selling on Walmart, the rules are changing. Keyword stuffing your product titles doesn't work when the "search engine" is an LLM that reads structured attributes. Sparky and Wallaby parse data, not marketing copy.

Fill Out Every Attribute

Color, size, weight, material—every field needs to be correct and complete. Walmart's SuperTruth and Momentum data initiatives penalize incomplete listings. If Sparky can't parse your product data, it won't show your product. Period.
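
A quick pre-submission check along these lines might look like the sketch below. The required fields are just the four named above; Walmart's actual attribute requirements vary by category and are not reproduced here, and the function names are hypothetical.

```python
# The four fields named in the text. Real required attributes depend on the
# product category; this tuple is an illustrative assumption.
REQUIRED_ATTRIBUTES = ("color", "size", "weight", "material")


def missing_attributes(listing):
    """Return required attributes that are absent, None, or blank."""
    missing = []
    for name in REQUIRED_ATTRIBUTES:
        value = listing.get(name)
        if value is None or not str(value).strip():
            missing.append(name)
    return missing


def completeness(listing):
    """Fraction of required attributes that are filled in (0.0 to 1.0)."""
    return 1 - len(missing_attributes(listing)) / len(REQUIRED_ATTRIBUTES)
```

Running this over a catalog export before upload gives a simple worklist: any listing with a non-empty `missing_attributes` result needs attention before an agent tries to parse it.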

A/B Test for Conversations, Not Grids

Small changes in images or titles shift how Sparky ranks products. What works in a search results grid doesn't necessarily work when an agent is narrating options to a user. Test your content in both contexts.

Write for Use Cases, Not Keywords

People ask Sparky "what sunscreen is good for sensitive skin?" not "SPF 50 sunscreen." Your descriptions need to match: "suitable for sensitive skin," "perfect for a 10th birthday party," "fits standard carry-on dimensions."

Get on Marty Now

Early adopters of Walmart's partner agent will have more data to optimize against when the ad auction gets crowded. Same logic as early Google Ads adopters—the brands that figured out the system first had a lasting advantage.

Where This Goes

Walmart and Amazon are building toward the same thing—agents that shop for you—but they're taking opposite routes.

Closed vs. Open

Amazon controls the full stack. Rufus owns discovery, checkout, and fulfillment. Trust is implicit because it's all one system. Walmart is breaking the stack apart: Sparky does discovery in the app, but checkout can happen on Gemini, ChatGPT, or anything else that speaks UCP. When you cross organizational boundaries like that, trust has to be built explicitly. It can't just be assumed.

Open Means You Need Verification

Inside Amazon, "is this agent authorized?" is easy—Amazon issued the agent, Amazon trusts it. In Walmart's model, agents from Google, OpenAI, and potentially anyone can start a transaction. Agentic Tokens and Agent Pay are a start. But portable, verifiable agent identity that works across merchants and platforms? That's still an open problem.

The $1 Trillion Signal

Walmart crossed $1 trillion in market cap in early 2026. For a big-box retailer, that's unusual. The market is pricing in the bet that this "people-led, tech-powered" strategy works. Whether it does depends on execution across all four super-agents, the protocol standards maturing, and someone solving the trust question for agents on the open web.

Sparky isn't a chatbot bolted onto a shopping app. It's the front end of a new commerce stack. The retailers and brands that understand that will adapt. The ones that treat it as a feature update won't.

Agents Are Already Transacting

Walmart's open model works because they control the backend. For everyone else, you need a way to verify which agents are authorized and which aren't. That's what KYA builds.
