
xml-encoder

An XML encoder escapes the five characters that have special meaning in XML (& < > " ') by replacing them with their predefined entity references (&amp; &lt; &gt; &quot; &apos;) so that XML parsers do not mistake content for markup; a decoder reverses the process. The ZTools XML Encoder runs entirely in the browser, supports both attribute-context and element-context encoding (which differ slightly in what must be escaped), and handles numeric character references for non-ASCII characters. It is a useful pre-flight step when injecting user input into XML documents (RSS feeds, SOAP envelopes, sitemaps, OPML, configuration files).

Use cases

Building RSS / Atom feeds

Feed items often contain user-supplied titles with "&" and "<". Encode them before placing them inside <title> elements so the feed validates.
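
A minimal sketch of this step in Python, assuming the standard-library helper xml.sax.saxutils.escape stands in for the encoder. The function name feed_title is hypothetical and not part of the tool.

```python
from xml.sax.saxutils import escape


def feed_title(raw_title: str) -> str:
    """Wrap a user-supplied title in a <title> element (hypothetical helper).

    escape() replaces &, < and > with &amp;, &lt; and &gt;, which is
    what element content needs.
    """
    return f"<title>{escape(raw_title)}</title>"


print(feed_title("Q&A: is 1 < 2?"))
# <title>Q&amp;A: is 1 &lt; 2?</title>
```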

Generating sitemap.xml

URLs with query strings often contain "&". Without encoding, the sitemap fails XML validation. Encode before embedding.
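
The failure and the fix can be reproduced with a strict parser. The sketch below assumes Python's standard library; the URL and the helper is_well_formed are made up for illustration.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def is_well_formed(doc: str) -> bool:
    """Return True if a strict XML parser accepts the document."""
    try:
        ET.fromstring(doc)
        return True
    except ET.ParseError:
        return False


url = "https://example.com/search?q=xml&page=2"  # hypothetical URL

raw_entry = f"<url><loc>{url}</loc></url>"
encoded_entry = f"<url><loc>{escape(url)}</loc></url>"

print(is_well_formed(raw_entry))      # False: the bare "&" is read as a broken entity reference
print(is_well_formed(encoded_entry))  # True
# The parser hands back the original URL, so nothing is lost by encoding.
print(ET.fromstring(encoded_entry).find("loc").text == url)  # True
```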

SOAP / XML API payloads

Older SOAP APIs use raw XML. User input in fields like <name> or <comment> must be encoded to avoid breaking the envelope.
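
When the payload is built in code rather than pasted by hand, an XML library's serializer can do the same escaping. A sketch under that assumption, using Python's xml.etree.ElementTree; the element names and the build_request helper are hypothetical and do not represent a real SOAP envelope.

```python
import xml.etree.ElementTree as ET


def build_request(name: str, comment: str) -> str:
    """Build a small XML payload; the serializer escapes text content."""
    root = ET.Element("request")
    ET.SubElement(root, "name").text = name
    ET.SubElement(root, "comment").text = comment
    return ET.tostring(root, encoding="unicode")


hostile = "</comment><admin>true</admin>"
payload = build_request("Tom & Jerry", hostile)
print(payload)

# The hostile input stays inside <comment> as plain text.
parsed = ET.fromstring(payload)
print(parsed.find("admin") is None)            # True
print(parsed.find("comment").text == hostile)  # True
```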

Config-file generation

Many tools (Maven, Spring) use XML config. When generating it dynamically, encoding user-supplied values prevents injection and breakage.

How it works

  1. Pick context: element content (escapes & < >) or attribute value (also escapes " '). Different contexts require different escapes.
  2. Paste text: raw text containing special characters.
  3. Encode: each special character is replaced by its named entity. Non-ASCII characters are optionally encoded as numeric references (&#xNNNN;).
  4. Inspect: a side-by-side original / encoded view highlights which characters changed.
  5. Copy: copy the encoded output to the clipboard. Reverse mode decodes back.
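
The context choice and the encode step can be sketched in a few lines of Python. This is an illustration of the rules described above, not the tool's actual source; the names xml_encode, context and numeric_non_ascii are assumptions.

```python
ELEMENT_ESCAPES = {"&": "&amp;", "<": "&lt;", ">": "&gt;"}
ATTRIBUTE_ESCAPES = {**ELEMENT_ESCAPES, '"': "&quot;", "'": "&apos;"}


def xml_encode(text: str, context: str = "element",
               numeric_non_ascii: bool = False) -> str:
    """Escape text for element content or for an attribute value."""
    table = ATTRIBUTE_ESCAPES if context == "attribute" else ELEMENT_ESCAPES
    out = []
    for ch in text:
        if ch in table:
            out.append(table[ch])              # named entity
        elif numeric_non_ascii and ord(ch) > 127:
            out.append(f"&#x{ord(ch):X};")     # numeric character reference
        else:
            out.append(ch)
    return "".join(out)


print(xml_encode("AT&T <best>"))                    # AT&amp;T &lt;best&gt;
print(xml_encode('price="$10"', "attribute"))       # price=&quot;$10&quot;
print(xml_encode("caf\u00e9", numeric_non_ascii=True))  # caf&#xE9;
```

Processing the input one character at a time means an "&" produced by an earlier replacement is never escaped a second time.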

Examples

Input: AT&T <best>

Output: AT&amp;T &lt;best&gt;


Input: price="$10"

Output: price=&quot;$10&quot; (attribute context only)


Input: &lt;tag&gt; → decode

Output: <tag> (one round of decoding)
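
"One round" matters because double-encoded input needs two passes. A sketch of reverse mode, assuming Python's xml.sax.saxutils.unescape as a stand-in for the decoder; xml_decode is a hypothetical name.

```python
from xml.sax.saxutils import unescape

# unescape() handles &lt;, &gt; and &amp; itself; the quote entities
# have to be passed in explicitly.
QUOTES = {"&quot;": '"', "&apos;": "'"}


def xml_decode(text: str) -> str:
    """Undo exactly one round of XML escaping."""
    return unescape(text, QUOTES)


print(xml_decode("&lt;tag&gt;"))            # <tag>
print(xml_decode("&amp;lt;tag&amp;gt;"))    # &lt;tag&gt; (still encoded once)
print(xml_decode("price=&quot;$10&quot;"))  # price="$10"
```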

Frequently asked questions

Why are there 5 named entities and not more?

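XML's spec defines only 5 (&amp; &lt; &gt; &quot; &apos;). HTML defines hundreds. If you need more, use numeric character references (&#NNNN; in decimal or &#xHHHH; in hexadecimal), which work for any Unicode code point that XML allows as a character.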
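
A short sketch of both numeric forms, assuming Python and its bundled strict parser; numeric_ref is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET


def numeric_ref(ch: str, hexadecimal: bool = False) -> str:
    """Return the numeric character reference for one character."""
    cp = ord(ch)
    return f"&#x{cp:X};" if hexadecimal else f"&#{cp};"


print(numeric_ref("\u00a9"))                    # &#169;
print(numeric_ref("\u00a9", hexadecimal=True))  # &#xA9;

# Decimal and hexadecimal references decode to the same character,
# including code points outside the Basic Multilingual Plane.
doc = (f"<c>{numeric_ref(chr(0xA9))}"
       f"{numeric_ref(chr(0xA9), hexadecimal=True)}"
       f"{numeric_ref(chr(0x1F600))}</c>")
print(ET.fromstring(doc).text == "\u00a9\u00a9\U0001F600")  # True
```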

When do I need to encode " and '?

Only inside attribute values delimited by that quote. Inside element content (between tags), they are not special.
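
Python's standard library applies the same rule in xml.sax.saxutils.quoteattr, which picks a delimiter and escapes a quote only when it has to. The sample values below are made up for illustration.

```python
from xml.sax.saxutils import quoteattr

# No double quote in the value: wrap in double quotes, nothing to escape.
print(quoteattr("it's"))         # "it's"

# Only double quotes in the value: switch the delimiter to single quotes.
print(quoteattr('say "hi"'))     # 'say "hi"'

# Both kinds present: delimit with double quotes and escape those.
print(quoteattr('a "b" \'c\''))  # "a &quot;b&quot; 'c'"

# & and < are still escaped, as in element content.
print(quoteattr("AT&T"))         # "AT&amp;T"
```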

What about CDATA sections?

CDATA sections (<![CDATA[ ... ]]>) protect literal characters by wrapping rather than encoding. The one sequence that cannot appear inside a CDATA section is ]]>, because it ends the section. The encoder offers a CDATA-wrap mode for content with many special characters.
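
A sketch of CDATA wrapping in Python. It is not the tool's implementation; cdata_wrap is a hypothetical helper that uses the common trick of splitting any "]]>" in the content across two adjacent sections.

```python
import xml.etree.ElementTree as ET


def cdata_wrap(text: str) -> str:
    """Wrap text in a CDATA section, splitting any embedded "]]>"."""
    # "]]>" would close the section early, so end the section after "]]"
    # and reopen a new one before ">".
    safe = text.replace("]]>", "]]]]><![CDATA[>")
    return f"<![CDATA[{safe}]]>"


snippet = "if (a[b[0]]>1 && x<y)"   # made-up content with &, < and ]]>
wrapped = cdata_wrap(snippet)
print(wrapped)

# A strict parser returns the original text unchanged.
print(ET.fromstring(f"<code>{wrapped}</code>").text == snippet)  # True
```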

Are HTML named entities (e.g. &nbsp;) allowed in XML?

No, unless the document's DTD declares them. Only &amp; &lt; &gt; &quot; &apos; are predefined in XML. Use a numeric reference (&#160;) for a non-breaking space or any other Unicode character.
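
Content copied from HTML can be converted mechanically. The sketch below assumes Python's html.entities.name2codepoint table (the HTML 4 entity names); html_entities_to_numeric is a hypothetical helper.

```python
import re
from html.entities import name2codepoint

# Entities that XML already predefines are left untouched.
PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}


def html_entities_to_numeric(text: str) -> str:
    """Rewrite HTML named entities as XML-safe numeric references."""
    def replace(match: re.Match) -> str:
        name = match.group(1)
        if name in PREDEFINED or name not in name2codepoint:
            return match.group(0)          # leave unknown names as they are
        return f"&#{name2codepoint[name]};"

    return re.sub(r"&([A-Za-z][A-Za-z0-9]*);", replace, text)


print(html_entities_to_numeric("a&nbsp;b &amp; &copy;"))
# a&#160;b &amp; &#169;
```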

Why does my XML break on a "smart quote"?

Smart quotes are non-ASCII. Either declare an encoding in the XML declaration (<?xml version="1.0" encoding="UTF-8"?>), write the file in that encoding, and emit them directly, or convert them to numeric references.
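
Both options in a short Python sketch, with a made-up element name and sample text. The "xmlcharrefreplace" error handler is a standard codec feature that emits decimal numeric references for anything the target encoding cannot represent.

```python
import xml.etree.ElementTree as ET

text = "\u201cquoted\u201d"   # left and right smart quotes

# Option 1: declare UTF-8 and write the characters directly as UTF-8 bytes.
utf8_doc = ('<?xml version="1.0" encoding="UTF-8"?>'
            f"<q>{text}</q>").encode("utf-8")

# Option 2: keep the file pure ASCII and use numeric references.
ascii_text = text.encode("ascii", "xmlcharrefreplace").decode("ascii")
ascii_doc = f"<q>{ascii_text}</q>"

print(ascii_text)  # &#8220;quoted&#8221;

# Both documents parse back to the same string.
print(ET.fromstring(utf8_doc).text == text)   # True
print(ET.fromstring(ascii_doc).text == text)  # True
```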

How is XML encoding different from URL encoding?

XML encoding produces entity references (& becomes &amp;); URL encoding produces percent codes (& becomes %26). Different contexts, different tables.
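
The two encodings often stack: a value is percent-encoded for the URL, and the finished URL is then entity-encoded for the XML document. A sketch assuming Python's urllib.parse.quote and xml.sax.saxutils.escape, with a made-up URL.

```python
from urllib.parse import quote
from xml.sax.saxutils import escape

value = "AT&T"

print(escape(value))          # AT&amp;T  (XML table)
print(quote(value, safe=""))  # AT%26T    (URL table)

# Step 1: percent-encode the query value so its "&" is not a separator.
url = "https://example.com/?q=" + quote(value, safe="") + "&lang=en"
# Step 2: entity-encode the whole URL before embedding it in XML.
xml_ready = escape(url)

print(xml_ready)  # https://example.com/?q=AT%26T&amp;lang=en
```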

Tips

  • Always encode user-supplied content before embedding it in XML; even seemingly safe strings can contain "&".
  • Use attribute-context encoding for attribute values and element-context encoding for content between tags. The rules differ.
  • CDATA wrapping is a clean alternative when content has many special characters: the wrapper protects the lot at once, provided the content does not contain "]]>".
  • When debugging mojibake in XML, dump the raw bytes; encoding-declaration mismatches are the #1 cause.
  • Always validate output XML in a strict parser; a missed encoding is silent until validation fails.
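
The last two tips can be combined into a quick check. The sketch below assumes Python's bundled strict parser; the sample bytes and the parse_text helper are made up for illustration.

```python
import xml.etree.ElementTree as ET

# "cafe" with an accented e, written as ISO-8859-1: the accent is one byte.
payload = "caf\u00e9".encode("iso-8859-1")
print(payload)  # b'caf\xe9'


def parse_text(declared: str):
    """Parse the payload under a declared encoding; None if rejected."""
    head = f'<?xml version="1.0" encoding="{declared}"?><n>'.encode("ascii")
    try:
        return ET.fromstring(head + payload + b"</n>").text
    except ET.ParseError:
        return None


# The lone 0xE9 byte is not valid UTF-8, so a strict parser rejects it.
print(parse_text("UTF-8"))                      # None
# With the matching declaration the same bytes parse cleanly.
print(parse_text("ISO-8859-1") == "caf\u00e9")  # True
```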

Try it now

The full xml-encoder runs in your browser at https://ztools.zaions.com/xml-encoder with no signup and no upload; no data leaves your device.



Last updated: 2026-05-05 · Author: Ahsan Mahmood