Encoding Detective — Identify & Decode Unknown Text

Paste unknown encoded text and instantly identify its encoding — Base64, URL, HTML entities, Hex, JWT, Unicode, and more. Auto-decodes each detected format.

Your data never leaves your browser. Available via MCP.

How to Use Encoding Detective

  1. Paste your encoded or mystery text into the input area.
  2. Click Detect to run the analysis.
  3. Review the list of detected encodings, each showing a confidence level and the decoded output.
  4. Click Copy next to any decoded value to copy it to your clipboard.
  5. Use Load Example to see how the tool works with a sample Base64 string.

What Is Encoding Detection?

Encoding detection is the process of analyzing a string of text or data and determining what encoding scheme was used to produce it. When you encounter an unfamiliar string — perhaps in an API response, a database field, a URL, a log file, or an email header — you need to know its encoding before you can decode it back to its original form. Encoding Detective automates this by applying pattern recognition to identify the most likely encoding formats and immediately showing you the decoded result.

Unlike character encoding detection (UTF-8 vs. ISO-8859-1), this tool focuses on data encodings — schemes that transform readable text or binary data into a different textual representation for safe transport, storage, or embedding in specific contexts.

Common Encoding Formats Explained

Base64

Base64 converts binary data into a 64-character ASCII alphabet (A-Z, a-z, 0-9, +, /). It is ubiquitous in web development: embedding images as data URIs, encoding email attachments via MIME, transmitting binary data in JSON payloads, and storing cryptographic keys in PEM format. Base64 increases data size by roughly 33%. The URL-safe variant (Base64url) replaces + with - and / with _, commonly seen in JWT tokens and URL parameters. Learn more in our Base64 encoding guide. Encode and decode Base64 with the Base64 Encoder/Decoder.
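As a rough illustration of the two alphabets described above, the following Python sketch decodes either standard Base64 or Base64url. The helper name decode_base64_any is hypothetical and is not part of Encoding Detective; it assumes the input may have had its = padding stripped, as is common in JWTs.

```python
import base64

def decode_base64_any(text: str) -> bytes:
    """Decode standard or URL-safe Base64, restoring any stripped padding."""
    # Map the URL-safe alphabet (- and _) back to the standard one (+ and /).
    normalized = text.strip().replace("-", "+").replace("_", "/")
    # Base64url values often omit '=' padding; pad to a multiple of 4.
    normalized += "=" * (-len(normalized) % 4)
    return base64.b64decode(normalized, validate=True)

encoded = base64.b64encode(b"Hello, World!").decode("ascii")
print(encoded)                              # SGVsbG8sIFdvcmxkIQ==
print(decode_base64_any(encoded))           # b'Hello, World!'
print(decode_base64_any("SGVsbG8sIFdvcmxkIQ"))  # same bytes, padding restored
```

The 13 input bytes become 20 output characters here, which is where the roughly 33% size increase (plus padding) comes from.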

URL Encoding (Percent-Encoding)

URL encoding replaces unsafe characters with % followed by two hexadecimal digits representing the character's byte value. Spaces become %20, ampersands become %26, and non-ASCII characters like é become multi-byte sequences like %C3%A9. This encoding is defined by RFC 3986 and is essential for constructing valid URLs and form submissions. Use the URL Encoder/Decoder to convert individual strings.
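A minimal sketch of percent-encoding using Python's standard urllib.parse module. Passing safe="" is a choice made for this example so that every reserved character is escaped; real URLs usually leave some characters (such as /) unescaped depending on the component.

```python
from urllib.parse import quote, unquote

text = "café & crème"

# safe="" escapes every reserved character; non-ASCII text is first
# encoded as UTF-8, so each accented letter becomes two %XX sequences.
encoded = quote(text, safe="")
print(encoded)           # caf%C3%A9%20%26%20cr%C3%A8me

# unquote reverses the transformation.
print(unquote(encoded))  # café & crème
```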

HTML Entities

HTML entities represent reserved and special characters in HTML documents. Named entities like &lt;, &amp;, and &quot; are human-readable references, while decimal numeric entities (&#60;) and hexadecimal entities (&#x3C;) use character code points. HTML encoding helps prevent cross-site scripting (XSS) by ensuring user input is rendered as text, not executable markup. Decode and encode these with the HTML Entity Encoder.
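The escaping and unescaping described above can be sketched with Python's standard html module. This is only an illustration of the entity forms, not the tool's own decoder.

```python
import html

# Escape markup so it renders as text instead of being parsed as HTML.
escaped = html.escape('<b>Tom & "Jerry"</b>')
print(escaped)  # &lt;b&gt;Tom &amp; &quot;Jerry&quot;&lt;/b&gt;

# Named, decimal, and hexadecimal entities all decode to the same character.
decoded = html.unescape("&lt; &#60; &#x3C;")
print(decoded)  # < < <
```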

Hexadecimal Strings

Hexadecimal encoding represents each byte as two hex digits (0-9, a-f). You will encounter hex strings in cryptographic hashes (SHA-256 produces 64 hex characters), MAC addresses, color codes, binary file dumps, and debugging output. Hex strings with a 0x prefix are common in programming languages. Generate hashes in hex format with the Hash Generator.
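A short sketch of hex encoding and decoding in Python. The decode_hex helper is a hypothetical name introduced here; it assumes an optional 0x prefix and an even number of hex digits.

```python
import hashlib

def decode_hex(text: str) -> bytes:
    """Decode a hex string, tolerating an optional 0x prefix."""
    cleaned = text.strip()
    if cleaned.lower().startswith("0x"):
        cleaned = cleaned[2:]
    # bytes.fromhex raises ValueError on odd length or non-hex characters.
    return bytes.fromhex(cleaned)

print("Hi".encode("utf-8").hex())  # 4869
print(decode_hex("0x4869"))        # b'Hi'

# A SHA-256 digest is 32 bytes, so its hex form is 64 characters long.
digest = hashlib.sha256(b"Hi").hexdigest()
print(len(digest))                 # 64
```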

JSON Web Tokens (JWT)

A JWT consists of three Base64url-encoded segments separated by dots: a header (specifying the algorithm), a payload (containing claims like user ID and expiration), and a signature. JWTs are the de facto standard for stateless authentication in web APIs. The header and payload are not encrypted — they are merely encoded, meaning anyone can read them. Use the JWT Decoder for detailed token analysis including timestamp formatting and expiration checking.
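To show that the header and payload are only encoded, here is a minimal Python sketch that builds a sample token and reads it back without any key. The function names, the claim values, and the placeholder signature are all assumptions made for this example; the sketch does not verify signatures and should not be used for authentication.

```python
import base64
import json

def b64url_encode(raw: bytes) -> str:
    # JWT segments use Base64url with the '=' padding stripped.
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")

def b64url_decode(segment: str) -> bytes:
    # Restore the stripped padding before decoding.
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))

def read_jwt_unverified(token: str) -> tuple[dict, dict]:
    """Return (header, payload) WITHOUT checking the signature."""
    header_b64, payload_b64, _signature = token.split(".")
    return json.loads(b64url_decode(header_b64)), json.loads(b64url_decode(payload_b64))

# Build a sample token; the third segment is a placeholder, not a real signature.
token = ".".join([
    b64url_encode(json.dumps({"alg": "HS256", "typ": "JWT"}).encode()),
    b64url_encode(json.dumps({"sub": "1234", "exp": 1700000000}).encode()),
    "placeholder-signature",
])

header, payload = read_jwt_unverified(token)
print(header)   # {'alg': 'HS256', 'typ': 'JWT'}
print(payload)  # {'sub': '1234', 'exp': 1700000000}
```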

Unicode Escape Sequences

Unicode escapes represent characters using their code points. The \uXXXX format (four hex digits) covers the Basic Multilingual Plane and is standard in JavaScript, Java, and JSON. Supplementary characters such as emoji need either a surrogate pair of two \uXXXX escapes or the extended \u{XXXXX} format, which is supported in ES6+ JavaScript. These escapes appear frequently in JSON strings, source code, and internationalization files.
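A rough Python sketch of decoding both escape forms from literal text. The helper decode_unicode_escapes is hypothetical; it assumes well-formed input, meaning code points no larger than 10FFFF and no unpaired surrogates.

```python
import re

# Matches \u{X...} (1 to 6 hex digits) or \uXXXX (exactly 4 hex digits).
_ESCAPE = re.compile(r"\\u\{([0-9A-Fa-f]{1,6})\}|\\u([0-9A-Fa-f]{4})")

def decode_unicode_escapes(text: str) -> str:
    def replace(match: re.Match) -> str:
        return chr(int(match.group(1) or match.group(2), 16))

    decoded = _ESCAPE.sub(replace, text)
    # Two \uXXXX escapes may form a surrogate pair; a UTF-16 round trip
    # merges each pair into the single supplementary character it encodes.
    return decoded.encode("utf-16", "surrogatepass").decode("utf-16")

print(decode_unicode_escapes(r"caf\u00e9"))        # café
print(decode_unicode_escapes(r"\u{1F600}"))        # 😀
print(decode_unicode_escapes(r"\uD83D\uDE00"))     # 😀 (surrogate pair)
```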

Other Encodings

Encoding Detective also identifies Punycode (the xn-- prefix used for internationalized domain names), binary strings (sequences of 0s and 1s representing ASCII bytes), octal escape sequences (base-8 character codes), and quoted-printable encoding (the =XX format used in email headers and MIME content). Each serves a specific purpose in networking, file systems, or messaging protocols.
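The four formats above can each be decoded in a line or two of Python. This is a sketch under simple assumptions: binary input is grouped into 8-bit ASCII bytes, octal escapes use a backslash followed by one to three octal digits, and quoted-printable content is UTF-8. The helper names are hypothetical.

```python
import quopri
import re

def decode_binary(text: str) -> str:
    """Decode space-separated or contiguous 8-bit groups as ASCII."""
    bits = text.replace(" ", "")
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8)).decode("ascii")

def decode_octal(text: str) -> str:
    """Replace \\NNN octal escapes with the characters they encode."""
    return re.sub(r"\\([0-7]{1,3})", lambda m: chr(int(m.group(1), 8)), text)

# Punycode: the built-in "idna" codec handles the xn-- prefix.
print(b"xn--mnchen-3ya".decode("idna"))                 # münchen
print(decode_binary("01001000 01101001"))               # Hi
print(decode_octal(r"\110\151"))                        # Hi
# Quoted-printable: =C3=A9 is the UTF-8 byte pair for é.
print(quopri.decodestring(b"caf=C3=A9").decode("utf-8"))  # café
```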

When You Need Encoding Detection

Developers encounter mystery encoded strings constantly. Common scenarios include: debugging API responses that contain double-encoded values, inspecting database fields where encoded data was stored without documentation, analyzing URL parameters in web application logs, examining email headers with quoted-printable or Base64 content transfer encoding, reverse-engineering third-party integrations, and investigating security issues where input sanitization applied the wrong encoding.

Rather than manually guessing the encoding and trying decoders one by one, Encoding Detective tests all common formats simultaneously and presents results ranked by confidence. This saves time and eliminates the trial-and-error approach that often leads to subtle bugs when the wrong encoding is assumed.

Multi-Layer Encoding

In practice, data is often encoded multiple times. A common pattern is URL-encoding a Base64 string so it can be safely embedded in a query parameter: the = padding becomes %3D and + becomes %2B. To decode multi-layer strings, start by identifying and decoding the outermost encoding, then paste the result back into the tool. For automated multi-step decoding, use the Pipeline Builder to chain decoders in sequence.
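The URL-encoded Base64 pattern described above can be reproduced and peeled apart in Python as follows. The sample bytes are chosen only because their Base64 form happens to contain both + and = characters.

```python
import base64
from urllib.parse import quote, unquote

original = b"ab>c"

# Layer 1: Base64. This input yields both '+' and '=' in the output.
inner = base64.b64encode(original).decode("ascii")
print(inner)   # YWI+Yw==

# Layer 2: percent-encode so the value is safe inside a query parameter.
outer = quote(inner, safe="")
print(outer)   # YWI%2BYw%3D%3D

# Decode in reverse order: outermost layer first, then the inner one.
step1 = unquote(outer)
step2 = base64.b64decode(step1, validate=True)
print(step2)   # b'ab>c'
```

Decoding in the wrong order fails: the percent signs in the outer string are not part of the Base64 alphabet, so a strict Base64 decoder rejects it.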

Security Considerations

Encoding is not encryption. Base64, URL encoding, and HTML entities provide no security whatsoever — they are reversible transformations that anyone can decode. Never rely on encoding to protect sensitive data. For actual security, use proper encryption (AES, RSA) or cryptographic hashing (SHA-256). If you encounter encoded data that looks like credentials or tokens, treat it as if it were in plaintext and handle it accordingly.

All processing in this tool happens client-side in your browser. No data is transmitted to any server. This makes it safe to paste sensitive strings like JWT tokens or encoded credentials without risk of exposure.

Frequently Asked Questions

What encodings can this tool detect?
Encoding Detective identifies Base64 (standard and URL-safe), URL percent-encoding, HTML entities (named, numeric, and hex), hexadecimal strings, JSON Web Tokens (JWT), Unicode escape sequences (\uXXXX and \u{XXXXX}), Punycode (internationalized domain names), binary strings, octal character codes, and quoted-printable encoding.
How does encoding detection work?
The tool applies pattern-matching heuristics for each encoding format. It checks structural characteristics — like % followed by two hex digits for URL encoding, or three dot-separated Base64url segments for JWT — then attempts to decode the input and verifies the result is valid. A confidence score (high, medium, or low) reflects how closely the input matches each pattern.
Can the tool detect multiple encodings at once?
Yes. Some strings match multiple encoding patterns. For example, a hex string such as deadbeef uses only characters from the Base64 alphabet, so it can also parse as valid Base64. The tool reports all possible encodings ranked by confidence so you can decide which interpretation is correct.
What does the confidence level mean?
High confidence means the input strongly matches the encoding pattern and decodes to readable text. Medium confidence means the pattern matches but the result may be ambiguous. Low confidence means the input loosely matches — it could be the encoding but other explanations are equally likely.
Is my data safe when using this tool?
Yes. All detection and decoding runs entirely in your browser using JavaScript. No data is sent to any server. You can verify this by checking your browser's Network tab — no requests are made when you click Detect.
What if my string uses multiple layers of encoding?
Encoding Detective identifies the outermost encoding layer. If your string is double-encoded (for example, a Base64 string that was then URL-encoded), decode the first layer, then paste the result back in to detect the next layer. For automated multi-step decoding, try the Pipeline Builder tool.
Why was no encoding detected for my input?
The tool requires the input to match known encoding patterns. Plain text, proprietary formats, or custom encryption will not be detected. Also, very short inputs (under 4 characters) may not have enough signal to reliably identify an encoding.
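The pattern-matching approach described in this FAQ can be sketched roughly as follows. This is not Encoding Detective's actual implementation: the detect function, the specific regular expressions, the "eyJ" prefix check for JWTs, and the way confidence labels are assigned are all assumptions made for illustration, and only four formats are covered.

```python
import base64
import binascii
import re

def looks_like_text(raw: bytes) -> bool:
    """True if the bytes are valid UTF-8 made of printable characters."""
    try:
        return raw.decode("utf-8").isprintable()
    except UnicodeDecodeError:
        return False

def detect(text: str) -> list[tuple[str, str]]:
    """Return (encoding, confidence) guesses for the input string."""
    s = text.strip()
    found = []
    # URL encoding: at least one % followed by two hex digits.
    if re.search(r"%[0-9A-Fa-f]{2}", s):
        found.append(("URL encoding", "high"))
    # JWT: three dot-separated Base64url segments; a JSON header usually
    # encodes to a segment starting with "eyJ".
    if s.startswith("eyJ") and re.fullmatch(r"[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]*", s):
        found.append(("JWT", "high"))
    # Hex: an even number of hex digits, optionally prefixed with 0x.
    if re.fullmatch(r"(?:0x)?(?:[0-9A-Fa-f]{2})+", s):
        found.append(("Hex", "medium"))
    # Base64: right alphabet and length, and the decode must succeed.
    if len(s) % 4 == 0 and re.fullmatch(r"[A-Za-z0-9+/]+={0,2}", s):
        try:
            raw = base64.b64decode(s, validate=True)
        except binascii.Error:
            pass
        else:
            found.append(("Base64", "high" if looks_like_text(raw) else "low"))
    return found

print(detect("SGVsbG8sIFdvcmxkIQ=="))  # [('Base64', 'high')]
print(detect("caf%C3%A9"))             # [('URL encoding', 'high')]
print(detect("deadbeef"))              # [('Hex', 'medium'), ('Base64', 'low')]
```

The last call shows why several candidates can be reported at once: deadbeef is valid hex and also parses as Base64, but the Base64 interpretation decodes to bytes that are not readable text, so it is ranked lower.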

Use this tool from AI agents. The CodeTidy MCP Server lets Claude, Cursor, and other AI agents use this tool and 46 others directly. One command: npx @codetidy/mcp
