Everyone has been throwing around "generative AI" for eighteen months, but few teams know where it actually moves the needle. We deployed LLMs across seven D2C brands between 2025 and 2026. Here's what works, what's marketing, and what we dropped.

Dynamic recommendations: yes, but not the way you think

Collaborative filtering algorithms ("people who bought X also bought Y") aren't new. What changes with an LLM is the ability to generate contextual merchandising — not just "here are 4 products", but "here are 4 products AND why they complete your choice".

On Maison Lerouge, AOV is up 28% since we started generating these blocks on the fly. The pipeline: take the cart in progress, have the LLM summarize the implicit "style profile", then generate 3 lines of contextual recommendation. Cost: ~€0.008 per session with GPT-4o-mini.
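
To make the flow concrete, here is a minimal sketch of the cart-to-recommendation step. It is an illustration under assumptions, not the production setup: `build_prompt`, `recommend` and the injected `complete` callable (any function that sends a prompt to a model and returns its text) are hypothetical names, and feeding in a list of candidate products from an existing recommender is our assumption.

```python
from typing import Callable, List


def build_prompt(cart: List[str], candidates: List[str]) -> str:
    """Build one prompt: style profile first, then three one-line recommendations."""
    cart_lines = "\n".join(f"- {item}" for item in cart)
    candidate_lines = "\n".join(f"- {item}" for item in candidates)
    return (
        "Cart in progress:\n" + cart_lines + "\n\n"
        "Candidate products:\n" + candidate_lines + "\n\n"
        "Step 1: summarize the shopper's implicit style profile in one sentence.\n"
        "Step 2: write exactly 3 lines, each naming one candidate product "
        "and why it completes the cart.\n"
        "Return only the 3 lines from step 2."
    )


def recommend(
    cart: List[str],
    candidates: List[str],
    complete: Callable[[str], str],  # hypothetical: wraps whatever LLM client you use
) -> List[str]:
    """Call the model and keep at most three non-empty lines of output."""
    raw = complete(build_prompt(cart, candidates))
    lines = [line.strip() for line in raw.splitlines() if line.strip()]
    return lines[:3]
```

Keeping the model call behind a plain callable makes the block easy to test with a stub and easy to swap between providers.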

Generated product copy: time saved, personality lost (sometimes)

On a 12,000-SKU catalog, writing 12,000 descriptions by hand is dead. GPT-4o-mini generates 80 entries/hour for ~€3. But out of those 80, 5 to 10 stray from the brand register: too generic, too much "AI soup".
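
Scaled to the whole catalog, those figures give a rough budget. The sketch below only does the arithmetic, assuming the ~€3 covers one hour (80 entries) of generation and that every SKU gets exactly one first-pass draft; both are our reading of the numbers, not measured totals.

```python
# Figures taken from the text above; the per-hour reading of the €3 is an assumption.
CATALOG_SIZE = 12_000
ENTRIES_PER_HOUR = 80
COST_PER_HOUR_EUR = 3.0
OFF_REGISTER_PER_BATCH = (5, 10)  # entries out of 80 that stray from the brand register


def first_pass_estimate(catalog_size: int = CATALOG_SIZE) -> dict:
    """Estimate hours, cost and off-register volume for one pass over the catalog."""
    hours = catalog_size / ENTRIES_PER_HOUR
    cost = hours * COST_PER_HOUR_EUR
    low, high = (catalog_size * n / ENTRIES_PER_HOUR for n in OFF_REGISTER_PER_BATCH)
    return {
        "hours": hours,
        "cost_eur": cost,
        "off_register_low": low,
        "off_register_high": high,
    }


estimate = first_pass_estimate()
print(estimate)
```

Under those assumptions a full pass is about 150 hours of generation for roughly €450, with somewhere between 750 and 1,500 entries needing a second look.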

Solution: light fine-tuning on 200 existing entries, plus an automatic eval layer that scores each entry against your tone of voice. Eval cost: ~€0.03/entry. Minimum score: 7/10 on the brand grid. Below that, regenerate; if an entry scores below 5/10 twice, a human edits it.
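
The gating rule can be sketched as a small loop. This is an illustration, not a reference implementation: `generate` and `score` are hypothetical callables standing in for the fine-tuned model and the tone-of-voice evaluator, and the cap on attempts is an assumption, since the rule above does not say what happens when an entry keeps scoring between 5 and 7.

```python
from typing import Callable, Tuple

PASS_SCORE = 7.0       # minimum score on the brand grid (from the text)
HARD_FAIL_SCORE = 5.0  # two scores below this send the entry to a human (from the text)
MAX_ATTEMPTS = 4       # assumption: cap on regenerations, not stated in the text


def gated_generate(
    generate: Callable[[], str],     # hypothetical: returns one draft description
    score: Callable[[str], float],   # hypothetical: returns a 0-10 brand-grid score
    max_attempts: int = MAX_ATTEMPTS,
) -> Tuple[str, str, int]:
    """Return (status, last_draft, attempts_used)."""
    hard_fails = 0
    text = ""
    for attempt in range(1, max_attempts + 1):
        text = generate()
        value = score(text)
        if value >= PASS_SCORE:
            return "accepted", text, attempt
        if value < HARD_FAIL_SCORE:
            hard_fails += 1
            if hard_fails == 2:
                # Second score under 5/10: stop regenerating, hand over to an editor.
                return "human_edit", text, attempt
    # Attempts exhausted without reaching 7/10: also route to a human (assumption).
    return "human_edit", text, max_attempts
```

Returning the attempt count alongside the status makes it cheap to track how often entries need a second or third pass.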

Support agents: where they actually close tickets

The myth: "AI will replace your support". The reality: a well-built agent handles 40 to 60% of level-1 tickets (order tracking, returns, sizing). Beyond that, it escalates.

On Klima, 52% of conversations end without human intervention, but we invested six weeks in the RAG layer (return policy, FAQ, product sheets) and the eval pipeline. Without that foundation, we were at 12%, with user satisfaction below 3/5.
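
The answer-or-escalate decision can be sketched in a few lines. This toy version swaps the real retrieval stack for plain word overlap so it runs without dependencies; a production RAG layer would typically use embeddings instead. The `route` function, the document ids and the 0.5 threshold are all hypothetical.

```python
import re
from typing import Dict, Optional, Set, Tuple


def tokenize(text: str) -> Set[str]:
    """Lowercase the text and split it into a set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question: str, docs: Dict[str, str]) -> Tuple[Optional[str], float]:
    """Return the best-matching doc id and the share of question words it covers."""
    q_tokens = tokenize(question)
    best_id, best_score = None, 0.0
    for doc_id, body in docs.items():
        overlap = len(q_tokens & tokenize(body)) / max(len(q_tokens), 1)
        if overlap > best_score:
            best_id, best_score = doc_id, overlap
    return best_id, best_score


def route(question: str, docs: Dict[str, str], min_overlap: float = 0.5) -> dict:
    """Answer from the knowledge base when a document matches well, else escalate."""
    doc_id, score = retrieve(question, docs)
    if doc_id is None or score < min_overlap:
        return {"action": "escalate", "source": None}
    return {"action": "answer", "source": doc_id}
```

The point of the threshold is that the agent hands over to a human whenever nothing in the knowledge base clearly covers the question, rather than improvising an answer.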

What we dropped

  • AI product image generation: quality is still too variable in 2026. A human photographer remains more profitable, especially for brand trust.
  • Extreme personalization ("every visitor sees their own site"): too costly on the tracking side for a weak return (+1.8% measured conversion).
  • AI chatbot on the homepage: conversion < 0.3%, killed.
  • High-volume auto-generated SEO content: Google penalizes it. Dead since the Helpful Content update.

The pattern that works: narrow + measured

One specific use case, a simple eval pipeline, a go/no-go based on numbers. Not "we deploy AI on everything", but "we deploy on level-1 support and watch the resolution rate at 30 days".
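
A go/no-go check of that kind can be as simple as the sketch below. It assumes each ticket records its opening date and whether it closed without a human; the `Ticket` fields, the `go_no_go` function and the 40% threshold (the low end of the level-1 range quoted earlier) are illustrative choices, not a fixed rule.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List, Tuple


@dataclass
class Ticket:
    opened: date
    resolved_without_human: bool


def go_no_go(
    tickets: List[Ticket],
    launch: date,
    threshold: float = 0.40,  # assumption: minimum acceptable resolution rate
    window_days: int = 30,
) -> Tuple[str, float]:
    """Compute the resolution rate over the first `window_days` and compare it to the threshold."""
    window_end = launch + timedelta(days=window_days)
    in_window = [t for t in tickets if launch <= t.opened < window_end]
    if not in_window:
        # No data in the window means no evidence to continue.
        return "no-go", 0.0
    rate = sum(t.resolved_without_human for t in in_window) / len(in_window)
    return ("go" if rate >= threshold else "no-go"), rate
```

Fixing the threshold before launch is what keeps the decision honest: the number is agreed first, then measured.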

If you want an AI audit for your e-commerce store, we do that in five days. Brief a project.