How LLMs read products

When a shopper asks ChatGPT, Perplexity or Gemini for a product recommendation, the model doesn't scroll your website. It queries a catalog index — a structured representation of products, scored by semantic relevance to the query.

The quality of that representation decides whether you appear.

What LLMs look for

  1. Clear semantic identity: one sentence answering "what is this product and who is it for?". Vague titles ("SKU-23487 · Premium Item") are skipped.
  2. Structured attributes: material, use case, size, season, audience. JSON-LD is the preferred format.
  3. Bilingual keywords: people ask in the language they think in. EN + ES coverage doubles your reach in Spanish-speaking markets.
  4. Reasoning: why should someone buy this rather than the alternatives? When is it the right choice?
  5. Signals of quality: reviews, return rate, certifications. Currently hard to acquire at scale, but improving.
  6. llms.txt: a tenant-level manifest telling crawlers what your catalog offers and where to find structured data.
  7. Markdown endpoints: clean text/markdown representations of products and entities. Sites exposing structured content as markdown score higher on Cloudflare's agent-readiness metric.
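
To make the "structured attributes" item concrete, here is a minimal sketch of a schema.org Product document built in Python. The product, its attribute values, and the shop.example.com URL are invented for illustration. This is not Clione's actual output, only one plausible shape for JSON-LD that carries a one-sentence identity plus a few attributes.

```python
import json


def product_jsonld(name, description, material, audience, url):
    """Build a minimal schema.org Product document as a plain dict."""
    return {
        "@context": "https://schema.org",
        "@type": "Product",
        "name": name,
        # One sentence: what the product is and who it is for.
        "description": description,
        "material": material,
        "audience": {"@type": "Audience", "audienceType": audience},
        "url": url,
    }


# Hypothetical product used only for this example.
doc = product_jsonld(
    name="Merino Trail Sock",
    description="A cushioned merino wool sock for hikers who walk long distances in cold weather.",
    material="Merino wool",
    audience="Hikers",
    url="https://shop.example.com/products/merino-trail-sock",
)

print(json.dumps(doc, indent=2))
```

The serialized output would typically be embedded in a page inside a script tag of type application/ld+json, or served on its own as a structured endpoint.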

What Clione does about each

| LLM need | Clione artifact |
| --- | --- |
| Semantic identity | core_identity text |
| Structured attributes | synthetic_properties JSON + JSON-LD |
| Bilingual keywords | search_keywords[] (EN + ES) |
| Reasoning | reasoning.{recommended_for, decision_logic, objection_handler} |
| Quality signals | Quality scores (⚠ currently LLM-inferred; see Quality Scores) |
| Catalog manifest | /.well-known/llms.txt served per-tenant |
| Markdown for crawlers | .md endpoints on every entity (text/markdown) |
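
As a rough picture of how these artifacts might sit together on one record, the sketch below models them with Python dataclasses. The field names come from the table above. The field types, the class names EnrichedProduct and Reasoning, and all sample values are assumptions made for illustration, so the real Clione schema may differ.

```python
from dataclasses import dataclass, asdict


@dataclass
class Reasoning:
    recommended_for: str
    decision_logic: str
    objection_handler: str


@dataclass
class EnrichedProduct:  # hypothetical container name
    core_identity: str          # one-sentence semantic identity
    synthetic_properties: dict  # structured attributes
    search_keywords: list       # EN + ES keywords
    reasoning: Reasoning


# Sample values are invented for this example.
record = EnrichedProduct(
    core_identity="A cushioned merino wool sock for cold-weather hikers.",
    synthetic_properties={"material": "merino wool", "season": "winter"},
    search_keywords=["hiking socks", "calcetines de senderismo"],
    reasoning=Reasoning(
        recommended_for="Hikers covering long distances in cold weather",
        decision_logic="Choose it when warmth and cushioning matter more than weight",
        objection_handler="Costs more than cotton but stays warm when damp",
    ),
)

# asdict converts nested dataclasses too, giving a JSON-ready dict.
payload = asdict(record)
print(payload["reasoning"]["recommended_for"])
```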

Why the traditional SEO stack isn't enough

Keyword stuffing, backlink farming, exact-match titles: these tactics assume a keyword-matching crawler. LLM-backed retrieval uses embeddings, comparing the meaning of the query against the meaning of your product text. A well-written product identity therefore tends to beat aggressive SEO copy.
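
The mechanics of embedding comparison can be sketched with cosine similarity. The three-dimensional vectors below are toy values assigned by hand for illustration, not output from any real embedding model. Production embeddings have hundreds or thousands of dimensions, and actual scores will differ. The example only shows how ranking by vector similarity works, not how any particular LLM scores products.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy axes, chosen by hand: [warmth, hiking, generic sales language]
query = [0.9, 0.8, 0.0]  # stands in for "warm socks for winter hikes"

# Hand-assigned toy vectors for three product texts.
products = {
    "Cushioned merino sock for cold-weather hikers": [0.8, 0.9, 0.1],
    "best socks cheap socks buy socks": [0.3, 0.1, 0.4],
    "SKU-23487 · Premium Item": [0.1, 0.0, 0.3],
}

# Rank product texts by similarity to the query, highest first.
ranked = sorted(products, key=lambda text: cosine(query, products[text]), reverse=True)

for text in ranked:
    print(f"{cosine(query, products[text]):.3f}  {text}")
```

With these toy values, the text that states what the product is and who it is for ranks first, because its vector points in nearly the same direction as the query.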

That's the whole bet of Clione: enrich once, serve semantically, win the LLM channel.

It's not just products

Every entity in your catalog (products, collections, categories, and pages) gets the full signal treatment. The same four signal formats (.jsonld, .llm, .md, .meta) are available for all entity types. Collections get CollectionPage schema, categories get their own JSON-LD, and pages get WebPage schema. Each is bundled with FAQPage when FAQ entries exist, and each has a markdown representation for Cloudflare agent-readiness. The LLM sees not just your products but your entire store's semantic structure.
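
The per-entity treatment described above might look roughly like the following sketch. Several things here are assumptions: the URL scheme (appending the format suffix to the entity URL), the function names, the choice of "Product" as the product schema type, and returning the FAQPage as a second document. The text does not name the schema type used for categories, so the mapping leaves them out.

```python
import json

# The four signal formats named in the text.
SIGNAL_FORMATS = (".jsonld", ".llm", ".md", ".meta")

# CollectionPage and WebPage come from the text; "Product" is an assumption.
SCHEMA_TYPES = {
    "product": "Product",
    "collection": "CollectionPage",
    "page": "WebPage",
}


def signal_urls(entity_url):
    """Hypothetical URL scheme: append each format suffix to the entity URL."""
    return [entity_url + suffix for suffix in SIGNAL_FORMATS]


def entity_jsonld(entity_type, name, faqs=()):
    """Return the entity document, plus a FAQPage document when FAQs exist."""
    docs = [{
        "@context": "https://schema.org",
        "@type": SCHEMA_TYPES[entity_type],
        "name": name,
    }]
    if faqs:
        docs.append({
            "@context": "https://schema.org",
            "@type": "FAQPage",
            "mainEntity": [
                {
                    "@type": "Question",
                    "name": question,
                    "acceptedAnswer": {"@type": "Answer", "text": answer},
                }
                for question, answer in faqs
            ],
        })
    return docs


# Invented collection and FAQ entry, for illustration only.
collection = entity_jsonld(
    "collection",
    "Winter Hiking",
    faqs=[("Are these socks machine washable?", "Yes, on a cold wool cycle.")],
)
print(json.dumps(collection, indent=2))
print(signal_urls("https://shop.example.com/collections/winter-hiking"))
```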