XML Feed Analyzer: How to Understand Supplier Product Feeds

A practical guide to understanding the structure and content of supplier XML product feeds.

2026-01-29

If you import products from suppliers, an XML feed analysis workflow shows you what a supplier feed actually contains before it reaches your catalog.

XML feeds can look “complete”, but problems like missing attributes, unclear variants, mixed languages, or price fields whose VAT treatment differs often show up only after import.

In this guide, you’ll learn a practical process to analyze supplier product feed XML files:

  • identify the repeating product item structure
  • choose a stable ID strategy (and understand variants)
  • map the fields you actually need (and what they mean)
  • run a short inspection checklist so imports are more predictable

The examples assume typical EU e-commerce reality: one supplier feed reused across markets, multiple currencies, and content not tailored to your local language.

What an XML product feed usually looks like

An XML product feed is usually one document with:

  • one root element (the top-level container), and
  • a repeated product item element (the thing you import as a product or a variant)

A simple custom structure might look like this:

<products>
  <product>
    <id>123</id>
    <name>Example product</name>
    <price>19.90</price>
  </product>
  <product>
    <id>124</id>
    <name>Another product</name>
    <price>29.90</price>
  </product>
</products>

In real supplier XML feeds, product items often include nested fields (brand, categories, images), and repeated sub-elements (multiple images, multiple categories, multiple languages).

There is no single standard. Two suppliers can both provide an “XML product feed” while using different element names and nesting. Your first job is to understand the structure the supplier chose.

If you prefer a flatter structure for spreadsheets or quick manual checks, it can be useful to convert the XML product feed to CSV first, as long as you know what you gain (and lose) with that conversion.

Step 1: Identify the product item element (and any namespaces)

Start by finding the repeated element that represents a product or a variant. This is the element you will loop over during import.
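
As a rough way to find that repeated element programmatically, the sketch below counts how often each container element (an element that has child elements) appears and reports the most frequent one. It uses Python's standard xml.etree.ElementTree module. The function name guess_item_element and the sample data are illustrative assumptions, and the heuristic can point at a nested <variant> instead of <product> in feeds with nested variants, so treat the result as a hint and confirm it by eye.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Illustrative sample; a real feed would be read from a file.
SAMPLE = """<products>
  <product><id>123</id><name>Example product</name><price>19.90</price></product>
  <product><id>124</id><name>Another product</name><price>29.90</price></product>
</products>"""


def guess_item_element(xml_text):
    """Return (tag, count) for the most frequent element that has children."""
    root = ET.fromstring(xml_text)
    # Only count "container" elements; leaf fields like <id> are skipped.
    counts = Counter(el.tag for el in root.iter() if len(el) > 0)
    return counts.most_common(1)[0]


print(guess_item_element(SAMPLE))  # ('product', 2)
```

For very large feeds, a streaming parser such as ET.iterparse is usually a better fit than loading the whole document at once.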

Common structures you will see

  • Custom supplier XML: often looks like <products><product>...</product></products> or <items><item>...</item></items>
  • RSS-like “channel/item” structures: sometimes used with namespaces (for example g: fields)
  • Marketplace-specific exports: shaped around a marketplace’s required fields and naming conventions

Example of an RSS-like structure:

<rss version="2.0" xmlns:g="http://base.google.com/ns/1.0">
  <channel>
    <item>
      <title>...</title>
      <g:id>...</g:id>
    </item>
  </channel>
</rss>

Namespaces: why fields can “look missing”

Namespaces are a common reason fields appear missing in a viewer or parser.

You may see prefixes like g: (for example g:id, g:price), or a default namespace declared on the root element.

Practical rule:

  • If the feed contains fields like g:price but your tool shows only a few fields, you may not be resolving namespaces.
  • When you build mappings, make sure your parser or tool can address namespaced fields correctly.
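
To illustrate why a namespaced field can look missing, here is a minimal sketch with Python's standard xml.etree.ElementTree. The sample feed and the namespace URI are assumptions (the URI is the one commonly bound to the g: prefix in Google-style feeds); use whatever your feed declares on its root element.

```python
import xml.etree.ElementTree as ET

# Illustrative sample; the namespace URI must match the one declared in your feed.
SAMPLE = """<rss version="2.0" xmlns:g="http://base.google.com/ns/1.0">
  <channel>
    <item>
      <title>Example product</title>
      <g:id>123</g:id>
      <g:price>19.90 EUR</g:price>
    </item>
  </channel>
</rss>"""

# Prefix-to-URI map used for lookups; the prefix name here is local to this script.
NS = {"g": "http://base.google.com/ns/1.0"}

root = ET.fromstring(SAMPLE)
item = root.find("./channel/item")

without_ns = item.find("price")      # None: the plain name does not match g:price
with_ns = item.find("g:price", NS)   # found once the namespace is resolved

print(without_ns)    # None
print(with_ns.text)  # 19.90 EUR
```

The lookup without the namespace map returns nothing, even though the field is clearly in the file.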

Step 2: Choose an identifier strategy and confirm how variants work

Before you map anything else, decide how you will identify products across imports. This is the difference between clean updates and duplicate products.

Typical ID fields and what they mean

Common candidates in a supplier XML feed:

  • Supplier internal ID (id, product_id): often stable within the supplier, but not meaningful outside that context.
  • SKU (sku, code): sometimes stable, sometimes changes when the supplier reorganizes.
  • EAN/GTIN (ean, gtin): useful for matching, but can be missing or reused incorrectly in some categories.
  • MPN (mpn): manufacturer part number, often incomplete for non-branded items.

What you can use in practice:

  • For deduplication and updates, prefer a stable supplier ID or stable variant ID if the feed provides it.
  • If the supplier provides both product-level and variant-level IDs, document how they relate (parent-child or independent items).
  • Treat EAN/GTIN as a helpful additional key, not always a safe primary key.
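
One way to compare ID candidates is to count missing and duplicated values per field. The sketch below does that on a small made-up sample; the function name id_report and the field names id, sku, and ean are assumptions taken from the examples above, not a fixed schema.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Illustrative sample: the EAN is duplicated once and empty once.
SAMPLE = """<products>
  <product><id>1</id><sku>A-1</sku><ean>4000000000001</ean></product>
  <product><id>2</id><sku>A-2</sku><ean>4000000000001</ean></product>
  <product><id>3</id><sku>A-3</sku><ean/></product>
</products>"""


def id_report(xml_text, item_tag, candidates):
    """For each candidate field, count missing values and list duplicated ones."""
    root = ET.fromstring(xml_text)
    items = list(root.iter(item_tag))
    report = {}
    for field in candidates:
        values = [(item.findtext(field) or "").strip() for item in items]
        filled = [v for v in values if v]
        dupes = sorted(v for v, n in Counter(filled).items() if n > 1)
        report[field] = {"missing": len(values) - len(filled), "duplicates": dupes}
    return report


for field, stats in id_report(SAMPLE, "product", ["id", "sku", "ean"]).items():
    print(field, stats)
```

A field with no missing values and no duplicates is a reasonable primary key candidate; whether it stays stable across updates still has to be confirmed with the supplier or by comparing two feed versions.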

Variants: two common patterns

Variant representation differs widely. Two common patterns:

Pattern A: one item per variant

Each <product> is a variant (size/color). Variant attributes are fields:

<product>
  <id>123-RED-M</id>
  <parent_id>123</parent_id>
  <color>Red</color>
  <size>M</size>
</product>

Pattern B: one item with nested variants

One product item contains multiple variants:

<product>
  <id>123</id>
  <name>...</name>
  <variants>
    <variant><id>123-RED-M</id><size>M</size></variant>
    <variant><id>123-RED-L</id><size>L</size></variant>
  </variants>
</product>

How to spot variant handling issues early:

  • If you see repeated “same name, different size/color”, you likely have one item per variant.
  • If you see nested <variants>, your importer must handle nested loops.
  • Look for stable variant IDs. If variant IDs are missing or generated, updates can create duplicates.
  • Check if stock and price are per variant or per parent. In EU shops, pricing can vary by size, and stock often differs per variant.
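
If the feed uses Pattern B, the nested loop can be sketched as follows: each nested variant becomes one flat row that keeps a reference to its parent. The function name flatten_variants and the output keys are illustrative assumptions, and the element names follow the example above.

```python
import xml.etree.ElementTree as ET

# Illustrative sample in the shape of Pattern B.
SAMPLE = """<product>
  <id>123</id>
  <name>Example shirt</name>
  <variants>
    <variant><id>123-RED-M</id><size>M</size></variant>
    <variant><id>123-RED-L</id><size>L</size></variant>
  </variants>
</product>"""


def flatten_variants(product):
    """Turn one product element into a list of flat variant rows."""
    parent_id = product.findtext("id")
    name = product.findtext("name")
    rows = []
    for variant in product.findall("./variants/variant"):
        rows.append({
            "id": variant.findtext("id"),
            "parent_id": parent_id,
            "name": name,
            "size": variant.findtext("size"),
        })
    if not rows:
        # No nested variants: treat the product itself as the only row.
        rows.append({"id": parent_id, "parent_id": None,
                     "name": name, "size": product.findtext("size")})
    return rows


for row in flatten_variants(ET.fromstring(SAMPLE)):
    print(row)
```

Flattening to the Pattern A shape means the rest of the import only has to handle one layout.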

Step 3: Map required fields and document field meaning

Once you know the product item element and ID strategy, map the fields you need for import. Keep the first mapping simple and expand later.

Required fields for most imports

Most imports need at least:

  • ID (and parent ID if variants exist)
  • Title / name
  • Price
  • Currency (or a documented assumption)
  • Availability / stock (quantity or status)
  • Product URL
  • Image URL (at least one)

Often needed downstream for filtering and merchandising:

  • Brand
  • Category (or supplier category mapping field)
  • GTIN/EAN (if available)

Common naming patterns (same meaning, different field names)

Suppliers use different names for the same concept. Examples you might see:

  • price: price, price_vat, gross_price, net_price, sale_price
  • stock: stock, qty, quantity, availability, in_stock
  • coded values: Y/N, true/false, 0/1, or “in stock/out of stock”

Write down your mapping decisions explicitly:

  • which field you chose
  • which alternatives exist
  • what transformations you will apply (decimals, VAT meaning, trimming spaces, parsing booleans)
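
As an example of writing a transformation down explicitly, the sketch below normalizes the coded availability values listed above into True, False, or None (unknown). The accepted value sets and the name parse_availability are assumptions; extend them to match what your supplier actually sends.

```python
# Assumed code tables; adjust to the values found in your feed.
TRUE_VALUES = {"y", "yes", "true", "1", "in stock", "in_stock"}
FALSE_VALUES = {"n", "no", "false", "0", "out of stock", "out_of_stock"}


def parse_availability(raw):
    """Return True/False for known codes and None when the meaning is unknown."""
    if raw is None:
        return None
    value = raw.strip().lower()
    if value in TRUE_VALUES:
        return True
    if value in FALSE_VALUES:
        return False
    if value.isascii() and value.isdigit():
        # A plain quantity such as "12": available when greater than zero.
        return int(value) > 0
    return None  # empty or unexpected value: do not guess


for raw in ["Y", " in stock ", "0", "12", "", "soon"]:
    print(repr(raw), parse_availability(raw))
```

Returning None for unknown values keeps unexpected codes visible instead of silently treating them as out of stock.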

Descriptions: often present, rarely ready to publish

Descriptions in supplier feeds are often the least reliable part of the data.

What to check:

  • Length: very short descriptions that do not explain the product
  • Language: one feed reused across EU markets often mixes languages
  • Duplication: the same text repeated across many products
  • HTML: embedded tags that break layouts or include formatting you do not want
  • Placeholders: “TBA”, “See website”, or content assuming the supplier’s store context

For a deeper explanation and practical options, see why supplier product descriptions are often not enough.

Step 4: Run sanity checks before import (structure, locale, consistency)

Before you build a full import mapping, validate the structure and sample items.

For a concrete way to inspect the file, you can upload the feed in the app and review its structure and sample products: which fields exist, how often they are present, and what the values look like on real items.

Quick structural checks

  • Count items: does the number of product items match what the supplier promised?
  • Check empties: are important fields present but empty (for example <ean/>, <price></price>)?
  • Check coverage: are required fields present for most items, or only for a subset?
  • Check repetition: are there multiple images/categories, and are you handling them intentionally?

Tip: sample 20–50 items across different categories (not only the first items in the file).
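
The count, empties, and coverage checks can be approximated with a few lines of Python. This is a minimal sketch on made-up data; field_coverage is a hypothetical helper that reports, per field, in how many items the element is present and in how many it is actually filled.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Illustrative sample: one empty price, one empty EAN, one missing EAN.
SAMPLE = """<products>
  <product><id>1</id><price>19.90</price><ean>4000000000001</ean></product>
  <product><id>2</id><price></price><ean/></product>
  <product><id>3</id><price>9.50</price></product>
</products>"""


def field_coverage(xml_text, item_tag):
    """Return (item_count, {field: (present_count, filled_count)})."""
    root = ET.fromstring(xml_text)
    items = list(root.iter(item_tag))
    present, filled = Counter(), Counter()
    for item in items:
        seen_present, seen_filled = set(), set()
        for child in item:
            seen_present.add(child.tag)
            # "Filled" means non-blank text or nested child elements.
            if (child.text or "").strip() or len(child) > 0:
                seen_filled.add(child.tag)
        present.update(seen_present)
        filled.update(seen_filled)
    return len(items), {tag: (present[tag], filled[tag]) for tag in present}


total, coverage = field_coverage(SAMPLE, "product")
print(total)     # 3
print(coverage)  # {'id': (3, 3), 'price': (3, 2), 'ean': (2, 1)}
```

A gap between the present and filled counts points at fields that exist but are empty, which is easy to miss when you only look at the first items.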

Price and currency checks (EU reality)

Pricing problems are common because feeds mix different assumptions:

  • Currency presence: is there a currency field, or does the supplier assume EUR?
  • Decimal format: 19.90 vs 19,90; watch for thousands separators (1,299.00 vs 1 299,00)
  • Net vs gross: fields like net_price and gross_price can exist together
  • VAT flags: do not guess; document what you use
  • Multiple price fields: regular price, sale price, recommended retail price

Practical check:

  • Pick one product you know from the supplier’s website or price list and verify that your chosen field matches the expected price and VAT meaning.
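
Decimal formats are worth normalizing in one place. The sketch below parses the formats mentioned above into Decimal values under one stated assumption: the last "." or "," is a decimal separator only when it is followed by one or two digits, otherwise all separators are treated as thousands separators. That rule and the name parse_price are assumptions, and a supplier who sends three decimal places would need a different rule.

```python
import re
from decimal import Decimal


def parse_price(raw):
    """Parse '19.90', '19,90', '1,299.00' or '1 299,00' into a Decimal."""
    # Drop everything except ASCII digits and separators (spaces, currency codes).
    cleaned = re.sub(r"[^0-9.,]", "", raw)
    if not any(ch.isdigit() for ch in cleaned):
        return None
    sep_pos = max(cleaned.rfind("."), cleaned.rfind(","))
    digits_after = len(cleaned) - sep_pos - 1
    if sep_pos != -1 and digits_after in (1, 2):
        # Assumption: one or two trailing digits mean a decimal fraction.
        whole = re.sub(r"[.,]", "", cleaned[:sep_pos]) or "0"
        return Decimal(f"{whole}.{cleaned[sep_pos + 1:]}")
    # Otherwise treat every separator as a thousands separator.
    return Decimal(re.sub(r"[.,]", "", cleaned))


for raw in ["19.90", "19,90", "1,299.00", "1 299,00", "1.299", "n/a"]:
    print(repr(raw), parse_price(raw))
```

Using Decimal instead of float avoids rounding surprises when prices are later multiplied by VAT rates or exchange rates.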

Language and content checks

For EU shops, language issues are normal, especially when the same feed serves multiple markets.

Check:

  • Is the description in the language you need for the target shop?
  • Is content mixed-language within the same feed?
  • Are there language-specific fields (for example <description_en>, <description_de>)?

If you sell in multiple languages, decide whether you will import one language only, or maintain separate language fields per market.

URL and image checks

URLs and images often cause hidden issues because they “look fine” but fail at scale.

Check:

  • Product URLs: valid, properly encoded, consistent domain, HTTPS
  • Image URLs: reachable, correct file format, not placeholders, not blocked by hotlink rules
  • Tracking parameters: decide if you keep or remove them for consistency

If you can, test a small random set of URLs (10–20) across the feed.
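
Before testing reachability, a format-only pass catches many problems without any network calls. The sketch below uses urllib.parse from the standard library; the helper name url_problems, the specific rules, and the shop.example URLs are illustrative assumptions, not a complete validator.

```python
from urllib.parse import urlsplit


def url_problems(url):
    """Return a list of format problems found in a product or image URL."""
    problems = []
    if url != url.strip() or " " in url:
        problems.append("whitespace")  # usually a sign of missing encoding
    parts = urlsplit(url.strip())
    if parts.scheme != "https":
        problems.append("not https")
    if not parts.netloc:
        problems.append("no host")
    if "utm_" in parts.query:
        problems.append("tracking parameters")
    return problems


for url in [
    "https://shop.example/p/123",
    "http://shop.example/img/1.jpg",
    "shop.example/p/1",
    "https://shop.example/p/1?utm_source=feed",
]:
    print(url, url_problems(url))
```

Reachability (HTTP status, content type, hotlink blocking) still needs real requests against a small random sample.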

Encoding and formatting basics (to avoid broken imports)

These checks help you avoid “everything failed to parse”.

Encoding (UTF-8 vs legacy encodings)

Most modern feeds should be UTF-8. If encoding is wrong, you will often see:

  • broken accented characters (common in EU languages)
  • replacement characters like ? or

What to look for:

  • the XML declaration at the top of the file:
<?xml version="1.0" encoding="UTF-8"?>

If there is no encoding declaration, the file may still be UTF-8, but confirm with your tooling before import.
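
A quick way to confirm is to compare the declared encoding with what the bytes actually decode as. The following is a minimal sketch that assumes an ASCII-compatible file (it will not detect UTF-16), and check_encoding is a hypothetical helper name.

```python
import re


def check_encoding(data):
    """Return (declared_encoding_or_None, decodes_as_utf8) for raw feed bytes."""
    # Look for encoding="..." in the XML declaration near the start of the file.
    match = re.search(rb'encoding=["\']([A-Za-z0-9._-]+)["\']', data[:200])
    declared = match.group(1).decode("ascii") if match else None
    try:
        data.decode("utf-8")
        utf8_ok = True
    except UnicodeDecodeError:
        utf8_ok = False
    return declared, utf8_ok


text = '<?xml version="1.0" encoding="UTF-8"?><name>Café</name>'
print(check_encoding(text.encode("utf-8")))    # ('UTF-8', True)
print(check_encoding(text.encode("latin-1")))  # ('UTF-8', False)
```

The second call shows the typical mismatch: the file declares UTF-8 but was saved in a legacy encoding, which is where broken accented characters come from.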

Special characters and XML validity

A common XML validity issue in supplier feeds is unescaped special characters inside text:

  • & should be written as &amp; in text nodes
  • < should not appear inside text unless escaped or inside CDATA

You may also see CDATA blocks:

<description><![CDATA[Some text with <b>HTML</b> & symbols]]></description>

CDATA can be fine, but it can also hide HTML you do not want in your store. Treat it as a content quality check, not only a parsing detail.
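
To see how these cases behave in a parser, the sketch below runs three small documents through Python's xml.etree.ElementTree and reports the position of the first error. The helper name well_formed_error and the sample strings are assumptions for illustration.

```python
import xml.etree.ElementTree as ET


def well_formed_error(xml_text):
    """Return None if the XML parses, else a message with line and column."""
    try:
        ET.fromstring(xml_text)
        return None
    except ET.ParseError as exc:
        line, column = exc.position
        return f"line {line}, column {column}: {exc}"


unescaped = "<product><name>Salt & Pepper</name></product>"
escaped = "<product><name>Salt &amp; Pepper</name></product>"
cdata = "<description><![CDATA[Some text with <b>HTML</b> & symbols]]></description>"

print(well_formed_error(unescaped))  # a parse error on line 1
print(well_formed_error(escaped))    # None
print(well_formed_error(cdata))      # None
print(ET.fromstring(cdata).text)     # Some text with <b>HTML</b> & symbols
```

The CDATA version parses cleanly, but the HTML tags arrive as plain text in the description, which is exactly the content you may want to strip or review.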

Product feed inspection checklist (copy/paste)

Use this product feed inspection checklist before you import:

  • Identify root element name and product item element name
  • Confirm namespaces (if any) and how fields are addressed
  • Confirm your unique ID strategy (stable across updates)
  • Confirm variant representation (one item per variant vs nested variants)
  • List the fields you will import (required + optional)
  • Run a missing-field check on a sample of products (20–50 items)
  • Validate price fields: currency, net/gross meaning, decimal format
  • Validate stock/availability semantics (what do “0”, empty, “N” mean?)
  • Validate product URLs and image URLs (format + basic reachability)
  • Check description: language, HTML, duplication signals, placeholders
  • Save notes for mapping rules / transformations needed (per field)

Pitfalls and edge cases

Pitfall 1: namespaces hide fields in your parser or viewer

Symptom:

  • the feed “looks empty” or you only see a few fields, even though the XML contains more

What to do:

  • check for prefixes like g: or a default namespace on the root element
  • make sure your tool resolves and addresses namespaced fields correctly

Pitfall 2: multiple price fields (net/gross/VAT) create silent pricing errors

Symptom:

  • your shop shows a price, but it is consistently too low or too high

What to do:

  • list all price-like fields (net_price, gross_price, price_vat, sale_price)
  • verify one known product against the supplier’s reference price
  • document whether your chosen field includes VAT and what currency is assumed
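
When a feed carries both a net and a gross price, the ratio between them hints at which VAT rate the supplier applied. The sketch below computes that implied rate; implied_vat_rate is a hypothetical helper, the prices are made up, and the result is a plausibility check, not a substitute for the supplier's documentation.

```python
from decimal import Decimal


def implied_vat_rate(net, gross):
    """Return the VAT rate implied by a net/gross pair, rounded to two decimals."""
    net_d, gross_d = Decimal(net), Decimal(gross)
    if net_d <= 0:
        return None  # cannot derive a rate from a zero or negative net price
    return (gross_d / net_d - 1).quantize(Decimal("0.01"))


print(implied_vat_rate("16.72", "19.90"))    # 0.19
print(implied_vat_rate("100.00", "121.00"))  # 0.21
print(implied_vat_rate("19.90", "19.90"))    # 0.00
```

A rate of 0.00 across the feed suggests the two fields hold the same value, so one of them is probably mislabeled or VAT is not applied at all.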

Pitfall 3: variant handling causes duplicates or missing variants

Symptom:

  • only one size/color appears after import, or updates create duplicates instead of updating

What to do:

  • confirm whether each feed item represents a variant
  • ensure there is a stable variant ID
  • confirm whether stock and price are variant-level (common) or product-level

Pitfall 4: invalid characters break XML parsing

Symptom:

  • the importer fails with a parsing error, often pointing to a specific line/column

What to do:

  • look for unescaped & in text fields (especially descriptions)
  • check that the XML is well-formed and the declared encoding matches the actual file
  • treat embedded HTML carefully (especially when not inside CDATA)

Pitfall 5: supplier updates change field names or semantics

Symptom:

  • your mapping worked last month, but now imports produce missing fields or wrong values

What to do:

  • re-run the checklist after supplier feed updates
  • watch for renamed fields (availability → stock_status) or changed meaning (price becomes net instead of gross)
  • keep a short mapping note so changes are visible and not guessed
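
Renamed fields can be detected by comparing the set of field names in the previous and the current feed. This is a minimal sketch with made-up data; item_fields and schema_diff are hypothetical helpers, and a change in meaning (such as net instead of gross) will not show up in a name comparison.

```python
import xml.etree.ElementTree as ET

# Illustrative feeds: the supplier renamed availability to stock_status.
OLD = "<products><product><id>1</id><availability>Y</availability></product></products>"
NEW = "<products><product><id>1</id><stock_status>Y</stock_status></product></products>"


def item_fields(xml_text, item_tag):
    """Collect the names of all direct child fields used by any item."""
    root = ET.fromstring(xml_text)
    return {child.tag for item in root.iter(item_tag) for child in item}


def schema_diff(old_xml, new_xml, item_tag):
    """Return field names that disappeared or appeared between two feeds."""
    old, new = item_fields(old_xml, item_tag), item_fields(new_xml, item_tag)
    return {"removed": sorted(old - new), "added": sorted(new - old)}


print(schema_diff(OLD, NEW, "product"))
# {'removed': ['availability'], 'added': ['stock_status']}
```

A field that appears in "removed" while a similar one appears in "added" is a strong hint of a rename and a good moment to update the mapping note.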

FAQ

Which field should I use as the product ID?

Use the most stable identifier that the supplier keeps consistent across updates. If variants exist, prefer a stable variant ID (and keep a parent ID relationship if available). Use EAN/GTIN as an additional reference, not always as the only key.

What if the feed has prices but no currency field?

Do not guess silently. Ask the supplier what currency is used, or confirm from documentation. In EU contexts, “assumed EUR” is common, but the risk is high if you sell in multiple markets or if the supplier supports multiple currencies.

What if my supplier provides CSV instead of XML?

CSV can be easier to inspect manually, but it has its own risks (delimiter issues, quoting, encoding). If you are deciding between formats, start by defining the fields you need, then check whether your tooling handles the supplier's CSV reliably.

Conclusion: analyze first, then import

An import is more predictable when you treat feed analysis as a repeatable process:

  • identify the product item element and namespaces
  • choose a stable ID strategy and confirm variant representation
  • map required fields and document naming and meaning
  • run sanity checks for price, currency, language, URLs, and images

Next step: inspect the feed before you import

If you want to validate a supplier feed quickly, you can inspect the feed structure and sample products in the app and use the results as mapping notes for your importer.
