Quality Scores — origin and limitations
Every enriched product carries five quality-related fields:
- durability_score: 1–10
- quality_perception: 1–10
- value_for_money: 1–10
- price_positioning: budget / mid-range / premium / luxury
- typical_competitors: array of strings
This page documents where those numbers come from, what they can and can't tell you, and the roadmap to improve them.
These are LLM-inferred opinions based on your catalog text. They are not backed by reviews, returns, certifications, sales data or any external evidence — yet.
Origin
The values are produced in a single LLM call per product, using this prompt (simplified — source: apps/products-api/src/lib/promptTemplates.ts):
You are a product analyst. Analyze the product below and return a single JSON object.
Product
-------
Title: {{title}}
Description: {{description}}
Price: {{price}} {{currency}}
Vendor: {{vendor}}
Categories: {{categories}}
Enrichment hints: {{enrichment_hints}} # from the Enrichment Wizard, if filled
Return ONLY valid JSON:
{
"durability_score": <number 1-10>,
"quality_perception": <number 1-10>,
"value_for_money": <number 1-10>,
"typical_competitors": ["<competitor 1>", "<competitor 2>"],
"price_positioning": "<budget|mid-range|premium|luxury>"
}
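Because the model is only asked to return JSON in this shape, a caller would normally check the response before storing it. The following is a minimal sketch of such a check, not the code in promptTemplates.ts: the names QualityScores, isScore and parseQualityScores are hypothetical, and rejecting the whole response on any invalid field is an assumption.

```typescript
// Hypothetical types and validator for illustration only.
type PricePositioning = "budget" | "mid-range" | "premium" | "luxury";

interface QualityScores {
  durability_score: number;
  quality_perception: number;
  value_for_money: number;
  typical_competitors: string[];
  price_positioning: PricePositioning;
}

const POSITIONINGS: string[] = ["budget", "mid-range", "premium", "luxury"];

// A score is valid when it is a finite number in the 1-10 range.
function isScore(value: unknown): value is number {
  return typeof value === "number" && isFinite(value) && value >= 1 && value <= 10;
}

// Returns the parsed scores, or null when the response does not match the schema.
function parseQualityScores(raw: string): QualityScores | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return null; // the model did not return valid JSON
  }
  if (typeof data !== "object" || data === null) return null;

  const fields = data as Record<string, unknown>;
  const durability = fields["durability_score"];
  const quality = fields["quality_perception"];
  const value = fields["value_for_money"];
  const competitors = fields["typical_competitors"];
  const positioning = fields["price_positioning"];

  if (!isScore(durability) || !isScore(quality) || !isScore(value)) return null;
  if (!Array.isArray(competitors)) return null;
  if (!competitors.every((c) => typeof c === "string")) return null;
  if (typeof positioning !== "string" || POSITIONINGS.indexOf(positioning) === -1) return null;

  return {
    durability_score: durability,
    quality_perception: quality,
    value_for_money: value,
    typical_competitors: competitors as string[],
    price_positioning: positioning as PricePositioning,
  };
}
```

A null result could be handled by retrying the call or leaving the product without scores; which of the two the pipeline does is not stated here.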
What the LLM uses
- Title
- Description
- Price + currency
- Vendor
- Categories
- Optional enrichment_hints from the Enrichment Wizard
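The inputs listed above map one-to-one onto the {{placeholders}} in the prompt. As a rough illustration of that substitution, here is a sketch under stated assumptions: ProductInput and renderPrompt are hypothetical names, categories are assumed to be joined with commas, and missing hints are assumed to render as an empty string. The actual template code may differ.

```typescript
// Hypothetical input shape; field names mirror the prompt placeholders.
interface ProductInput {
  title: string;
  description: string;
  price: number;
  currency: string;
  vendor: string;
  categories: string[];
  enrichment_hints?: string; // only present when the Enrichment Wizard was filled
}

// Replaces each {{name}} placeholder with the matching product field.
// Placeholders with no matching field are left untouched.
function renderPrompt(template: string, product: ProductInput): string {
  const values: Record<string, string> = {
    title: product.title,
    description: product.description,
    price: String(product.price),
    currency: product.currency,
    vendor: product.vendor,
    categories: product.categories.join(", "),
    enrichment_hints: product.enrichment_hints === undefined ? "" : product.enrichment_hints,
  };
  return template.replace(/\{\{(\w+)\}\}/g, (match: string, key: string) =>
    Object.prototype.hasOwnProperty.call(values, key) ? String(values[key]) : match
  );
}
```

Nothing outside these seven fields reaches the model in this sketch, which matches the list above: any signal not present in the catalog text or the wizard answers cannot influence the scores.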
What the LLM does NOT use
- Real reviews (Google / Trustpilot / internal)
- Returns rate
- Sales / conversion data
- External catalog comparatives
- Manufacturer certifications (ISO, CE, GOTS, OEKO-TEX, MIL-STD, etc.)
- Lifecycle test results
- Sustainability databases
Consequences
- Scores are synthetic opinions, not verified facts.
- Reproducibility is not guaranteed: the same product can score differently across runs if the model temperature is non-zero.
- Useful relatively, not absolutely: the LLM has common-sense knowledge of market tiers (Hermès → luxury, Primark → budget), so scores differentiate products inside your own catalog. They are not a benchmark against competitors.
- typical_competitors has the same limitation: inferred, not verified.
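One way to use the scores relatively rather than absolutely is to convert each raw score into a rank within your own catalog. The sketch below is an illustration only: ScoredProduct and percentileRanks are hypothetical names, and the mid-rank handling of ties is an arbitrary choice.

```typescript
// Hypothetical minimal product shape for illustration.
interface ScoredProduct {
  id: string;
  durability_score: number;
}

// Maps each product id to its percentile rank (0 to 1) within the given catalog.
// Ties share the same rank (mid-rank convention).
function percentileRanks(products: ScoredProduct[]): Record<string, number> {
  const ranks: Record<string, number> = {};
  for (const p of products) {
    const below = products.filter((q) => q.durability_score < p.durability_score).length;
    const equal = products.filter((q) => q.durability_score === p.durability_score).length;
    ranks[p.id] = (below + 0.5 * equal) / products.length;
  }
  return ranks;
}
```

A product with a rank of 0.875 scored higher than most of the catalog it was compared with; the same raw score could rank very differently in another catalog.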
UI disclosure
The dashboard shows a badge above every Quality Scores block indicating the evidence level:
| Badge | Meaning |
|---|---|
| 🟢 Data-grounded | External evidence backs the score (future: reviews, returns, certifications) |
| 🔵 AI + owner hints | Enrichment Wizard was filled for this product |
| 🟡 AI-inferred | Pure LLM opinion (default — synthetic) |
The level is stored in metadata.scores_evidence_level and persisted across re-enrichments.
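As an illustration of how a client could turn the stored level into one of the badges above, here is a minimal sketch. The string identifiers (data_grounded, ai_owner_hints, ai_inferred) are assumptions, since the actual values of metadata.scores_evidence_level are not listed on this page, and badgeFor is a hypothetical helper.

```typescript
// Assumed identifiers for metadata.scores_evidence_level; the real stored values may differ.
type StoredEvidenceLevel = "data_grounded" | "ai_owner_hints" | "ai_inferred";

const BADGE_LABELS: Record<StoredEvidenceLevel, string> = {
  data_grounded: "🟢 Data-grounded",
  ai_owner_hints: "🔵 AI + owner hints",
  ai_inferred: "🟡 AI-inferred",
};

// Returns the badge label for a product's metadata.
// Missing or unrecognized levels fall back to AI-inferred, the documented default.
function badgeFor(metadata: { scores_evidence_level?: string }): string {
  const level = metadata.scores_evidence_level;
  if (level === "data_grounded" || level === "ai_owner_hints" || level === "ai_inferred") {
    return BADGE_LABELS[level];
  }
  return BADGE_LABELS.ai_inferred;
}
```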
How scores improve over time
Scores start as AI-inferred (pure LLM opinion) and become more accurate as you provide additional evidence:
- Fill the Enrichment Wizard: answering its questions about warranty, certifications, and materials moves your scores up to the AI + owner hints level.
- Add verified competitors: replacing LLM-inferred competitors with your own verified list in the Edit tab gives the system ground truth for positioning.
- External data integrations: connecting Google Analytics, Search Console, or reviews data enables data-grounded scores backed by real-world evidence.
The three-badge system (data-grounded, AI + hints, AI-inferred) always shows the current evidence level so you know how much to trust each score.
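The progression described above can be summarized as a small decision function. This is a sketch under assumptions rather than the product's actual logic: EvidenceInputs and currentEvidenceLevel are hypothetical names, and giving external evidence precedence over wizard hints is assumed, not stated on this page.

```typescript
// Hypothetical summary of the evidence available for one product.
interface EvidenceInputs {
  wizardFilled: boolean;        // Enrichment Wizard answered for this product
  hasExternalEvidence: boolean; // future: reviews, returns, certifications
}

type EvidenceLabel = "Data-grounded" | "AI + owner hints" | "AI-inferred";

// Picks the strongest evidence level that applies (assumed precedence).
function currentEvidenceLevel(inputs: EvidenceInputs): EvidenceLabel {
  if (inputs.hasExternalEvidence) return "Data-grounded";
  if (inputs.wizardFilled) return "AI + owner hints";
  return "AI-inferred"; // default: pure LLM opinion
}
```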