Unicode to Text

Decode Unicode escape sequences back into text.

Supported escape formats

  • \uXXXX — UTF-16 escape (4 hex digits)
  • \u{XXXXX} — ES6 code-point escape
  • \xXX — Latin-1 / byte escape (2 hex digits)
  • &#NNN; / &#xHHH; — HTML numeric entities

Unrecognised sequences are passed through unchanged.

Processed on your device. We never see your files.

How to use Unicode to Text

What this tool does

Unicode to Text takes a string that contains Unicode escape sequences and converts it back into plain, readable text. You paste the escaped string — as copied from a JavaScript source file, a JSON payload, a CSS stylesheet, or any other context where characters appear as backslash codes — and the decoded text appears in real time in the output box below.

The tool understands four common escape formats. The \uXXXX form is the classic four-hex-digit escape used in JavaScript string literals and JSON. The \u{XXXXX} form is the ES6 code-point escape that supports characters above U+FFFF in a single token. The \xXX form is the two-hex-digit byte escape used in older JavaScript and in many other languages like Python (for Latin-1 values). HTML numeric entities, both decimal (&#NNN;) and hexadecimal (&#xHHH;), are also decoded, which covers a wide range of web-related content you might need to read. Sequences that do not match any of these patterns pass through untouched, so mixed input works naturally.
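As a rough illustration of how the four formats can be handled in a single pass, here is a minimal sketch. It is not the tool's actual source: the names ESCAPE_PATTERN and decodeEscapes are hypothetical, and out-of-range values are simply passed through here rather than reported as errors.

```javascript
// Hypothetical one-pass decoder; a sketch, not the tool's real implementation.
// Alternation order matters: \u{...} is tried before the four-digit \uXXXX form.
const ESCAPE_PATTERN =
  /\\u\{([0-9a-fA-F]{1,6})\}|\\u([0-9a-fA-F]{4})|\\x([0-9a-fA-F]{2})|&#x([0-9a-fA-F]+);|&#([0-9]+);/g;

function decodeEscapes(input) {
  return input.replace(ESCAPE_PATTERN, (match, brace, u4, x2, htmlHex, htmlDec) => {
    // \uXXXX is a UTF-16 code unit; emitting code units lets adjacent
    // surrogate halves join into one character in the resulting string.
    if (u4 !== undefined) return String.fromCharCode(parseInt(u4, 16));
    // \xXX is treated as a Latin-1 value (U+0000 to U+00FF).
    if (x2 !== undefined) return String.fromCharCode(parseInt(x2, 16));
    const codePoint =
      brace !== undefined ? parseInt(brace, 16)
      : htmlHex !== undefined ? parseInt(htmlHex, 16)
      : parseInt(htmlDec, 10);
    // Assumption: values beyond U+10FFFF are left untouched in this sketch.
    return codePoint <= 0x10ffff ? String.fromCodePoint(codePoint) : match;
  });
}
```

Anything the pattern does not match, such as plain text or an unknown sequence like \q, is never passed to the callback, so it reaches the output unchanged.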

Why you might need it

Escaped strings turn up in logs, database dumps, API responses, and minified code. Reading \u0048\u0065\u006C\u006C\u006F directly takes effort; seeing Hello is instant. A quick decoder helps developers debugging an API that encodes non-ASCII characters as escapes, testers inspecting a minified JavaScript bundle, and anyone copying text from a JSON file in which every non-ASCII character was escaped.

Web developers also encounter HTML numeric entities when scraping or pre-processing HTML. A content block might contain &#8220; and &#8221; for curly quotes, or &#169; for the copyright symbol, and decoding them in a spreadsheet or text editor is tedious. Pasting the block here reveals the original characters immediately.

The reverse tool, Text to Unicode Escape, is linked in the related tools section if you need to go in the other direction.

How to use it

  1. Paste your escaped string into the Unicode escape sequences box. The font is monospace to make the escape patterns easier to read.
  2. The decoded text appears in the Decoded text box in real time.
  3. If the input contains something malformed — a \u{ without a closing brace, or a code point beyond U+10FFFF — an error message appears describing the problem.
  4. Click Copy decoded text to copy the result to your clipboard.
  5. Use Load sample to see a pre-built example that mixes \uXXXX and \u{XXXX} escapes with accented characters.
  6. Click Clear to reset both boxes.
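The two malformed cases mentioned in step 3 could be detected along the following lines. This is only a sketch under assumptions: findEscapeErrors is a hypothetical helper, and the message wording is invented for illustration rather than taken from the tool.

```javascript
// Hypothetical validator; the error messages are assumptions, not the tool's own.
function findEscapeErrors(input) {
  const errors = [];
  // Match every "\u{" opener, capturing its hex digits and an optional closing brace.
  const opener = /\\u\{([0-9a-fA-F]*)(\})?/g;
  for (const m of input.matchAll(opener)) {
    const digits = m[1];
    const closingBrace = m[2];
    if (closingBrace === undefined) {
      errors.push(`Unclosed \\u{ at index ${m.index}`);
    } else if (parseInt(digits, 16) > 0x10ffff) {
      errors.push(`Code point U+${digits.toUpperCase()} at index ${m.index} is beyond U+10FFFF`);
    }
  }
  return errors;
}
```

With this sketch, findEscapeErrors("\\u{1F600") and findEscapeErrors("\\u{110000}") each return one message, while well-formed input returns an empty array.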

Common pitfalls

Double-escaping is the most frequent source of confusion. In a JSON file viewed as raw text, a Unicode escape looks like \\u0048 because the backslash itself is escaped. If you copy that raw JSON text rather than the parsed value, you will have a literal backslash followed by a u, not the \u escape the decoder expects. In that case, the decoder sees plain text \\u0048 and leaves it as-is. The fix is to parse or unescape one level first, or to remove the extra backslash manually.
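The two ways of removing one escape level can be sketched as follows. The sample JSON and the helper stripOneEscapeLevel are hypothetical and only illustrate the idea; String.raw is used so the backslashes appear exactly as they would in the raw text.

```javascript
// Raw JSON text as it sits on disk: two backslashes precede "u0048".
const rawJson = String.raw`{"greeting":"\\u0048ello"}`;

// Option 1: parse the JSON so the parser removes one level of escaping.
// The value is now a single backslash followed by "u0048ello".
const parsedValue = JSON.parse(rawJson).greeting;

// Option 2 (hypothetical helper): collapse each doubled backslash by hand,
// for fragments that are not complete JSON documents.
function stripOneEscapeLevel(text) {
  return text.replace(/\\\\/g, "\\");
}
const copiedFragment = String.raw`\\u0048ello`;
const singleEscaped = stripOneEscapeLevel(copiedFragment);
```

Either result is a single-escaped string that a \uXXXX decoder can then turn into Hello.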

Surrogate pairs in \uXXXX form are decoded automatically when the tool sees a high surrogate (\uD800 to \uDBFF) immediately followed by a low surrogate (\uDC00 to \uDFFF). The pair is combined into the single astral code point it represents. A lone surrogate, one without a matching partner, is decoded to the corresponding code unit, which may display as a replacement character depending on your font and system.
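The arithmetic behind combining a pair is standard UTF-16 and can be sketched as below. The function name combineSurrogates is hypothetical, and returning null for an invalid pair is a choice made for this illustration, not a description of the tool's internals.

```javascript
// Hypothetical helper: combine a UTF-16 surrogate pair into one code point.
function combineSurrogates(high, low) {
  const isHigh = high >= 0xd800 && high <= 0xdbff;
  const isLow = low >= 0xdc00 && low <= 0xdfff;
  // Not a valid pair: the caller would keep the lone code unit as-is.
  if (!isHigh || !isLow) return null;
  // Each half contributes 10 bits; the result is offset by 0x10000.
  return 0x10000 + ((high - 0xd800) << 10) + (low - 0xdc00);
}
```

For example, combineSurrogates(0xD83D, 0xDE00) yields 0x1F600, so \uD83D\uDE00 and \u{1F600} decode to the same character.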

Named HTML entities such as &amp; or &nbsp; are not handled; only numeric references (&#38; or &#x26;) are. This is by design: named entity resolution requires a full mapping table and is outside the scope of a Unicode escape decoder.

Tips and advanced use

If you have a large blob of escaped text — an entire minified JavaScript file or a bulk JSON export — paste the whole thing. The decoder processes it in one pass and only touches the escape sequences, leaving the rest of the content intact. This makes it safe to use on files where only some strings are escaped and the surrounding code or markup should not be touched.

When decoding \xXX escapes, keep in mind that each \xXX maps to a single byte value, not necessarily to the Unicode code point you intended. Values \x00 to \x7F are identical to their ASCII equivalents, and \x80 to \xFF map to the same code points as Latin-1 (U+0080 to U+00FF). If your source is a multi-byte UTF-8 sequence expressed as a chain of \xXX escapes (common in Python’s repr() output for bytes objects that hold non-ASCII text), this tool will not reassemble the bytes into UTF-8; it decodes each byte individually. Use a purpose-built UTF-8 byte-sequence decoder for that case.
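For the UTF-8 case, a separate decoding step along these lines could be used. This is a sketch, not part of the tool: decodeHexBytesAsUtf8 is a hypothetical helper that relies on the standard TextDecoder API available in browsers and recent Node.js versions.

```javascript
// Hypothetical helper: treat each run of \xXX escapes as UTF-8 bytes
// instead of decoding every byte as a separate Latin-1 character.
function decodeHexBytesAsUtf8(input) {
  const utf8 = new TextDecoder("utf-8");
  return input.replace(/(?:\\x[0-9a-fA-F]{2})+/g, (run) => {
    // Collect the byte values of the whole run before decoding them together.
    const values = Array.from(run.matchAll(/\\x([0-9a-fA-F]{2})/g), (m) => parseInt(m[1], 16));
    // By default, invalid UTF-8 is replaced with U+FFFD rather than throwing.
    return utf8.decode(new Uint8Array(values));
  });
}
```

Given caf\xC3\xA9, this returns café, whereas byte-by-byte Latin-1 decoding of the same input produces cafÃ©.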

Frequently asked questions

Does the tool send my escape sequences anywhere to decode them?
No. All decoding is done by JavaScript running locally in your browser. Nothing you paste into this tool is transmitted to any server, stored, or logged. If you are decoding escape sequences from a confidential codebase or a sensitive document, your data stays entirely on your machine.
What escape formats does this tool understand?
The tool recognises four forms: \uXXXX (classic UTF-16, four hex digits), \u{XXXXX} (ES6 code-point, one to six hex digits), \xXX (Latin-1 byte escape, two hex digits), and HTML numeric entities in both decimal (&#NNN;) and hexadecimal (&#xHHH;) form. Characters and sequences that do not match any of these formats are left unchanged.
What happens if I mix formats in the same input?
That is fine. The decoder processes the entire input in one pass and handles each escape form wherever it finds it. You can paste a string that contains \uXXXX sequences alongside &#x entities and literal text, and each recognised escape will be decoded while everything else is kept as-is.
The output looks correct but has extra characters — why?
If you decoded a string that contained both escape sequences and literal Unicode characters, both are present in the output. Also check whether your source had double-escaped sequences: \\\\uXXXX (four backslashes in the raw data) means the first pass only produces a literal \uXXXX string, which would need a second decode. Paste the result back in and decode again if that is the case.
Can I use this to decode HTML entities in a web page?
Yes, for numeric entities (&#60; or &#x3C;). Named entities like &amp;, &lt;, or &copy; are not decoded by this tool; it only handles numeric references. For a full HTML entity decoder, you would need a dedicated HTML parser.

Related tools