Text Codec

All-in-one encoder and decoder: Base64, Hex, Unicode escapes, HTML entities, GBK garbled-text repair, and MD5 / SHA-256 hashing.

About Text Codec Tool

The Text Codec Tool integrates six commonly used functions: Base64 encode/decode, Hex (Base16) encode/decode, Unicode escape/unescape, HTML entity conversion, GBK garbled-text repair, and MD5/SHA-256 hash computation. All operations run locally in your browser — no data is ever uploaded.

Base64, Hex, Unicode, and HTML modes support both encoding and decoding. GBK mode focuses on garbled-text repair (GBK-encoded Chinese bytes mistakenly decoded as Latin-1); its "Encode" direction reproduces the garbled form so you can verify a repair. Hash mode computes the MD5 and SHA-256 digests together.
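
To make the Unicode escape/unescape direction concrete, here is a minimal Python sketch. It assumes JavaScript-style \uXXXX escapes over UTF-16 code units, which is only one plausible format; the page does not say which escape style the tool emits, and the function names unicode_escape and unicode_unescape are hypothetical.

```python
import re


def unicode_escape(text: str) -> str:
    """Escape every non-ASCII character as one or two \\uXXXX sequences."""
    out = []
    for ch in text:
        if ord(ch) < 128:
            out.append(ch)  # ASCII passes through unchanged
            continue
        # Characters above U+FFFF become a surrogate pair (two code units).
        units = ch.encode("utf-16-be")
        for i in range(0, len(units), 2):
            out.append("\\u%04x" % int.from_bytes(units[i:i + 2], "big"))
    return "".join(out)


def unicode_unescape(text: str) -> str:
    """Turn \\uXXXX sequences back into characters, rejoining surrogate pairs."""
    raw = re.sub(
        r"\\u([0-9a-fA-F]{4})",
        lambda m: chr(int(m.group(1), 16)),
        text,
    )
    # Round-trip through UTF-16 so that surrogate pairs merge into one
    # character; an unpaired surrogate raises UnicodeDecodeError here.
    return raw.encode("utf-16-be", "surrogatepass").decode("utf-16-be")
```

For example, unicode_escape("你好") gives \u4f60\u597d, and unicode_unescape reverses it.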

How to Use

  • Select the codec mode (Base64 / Hex / Unicode / HTML / GBK / Hash)
  • Paste the text into the input area (GBK mode accepts garbled strings; Hex mode auto-strips spaces, 0x prefix, and dashes)
  • Click the action button (Encode / Decode / Compute Hash) and view the result
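
The Hex-mode cleanup described in the steps above (dropping spaces, 0x prefixes, and dashes) can be sketched in Python as follows. This is an illustration under assumptions, not the tool's actual code: the function names normalize_hex and decode_hex are hypothetical, and rejecting odd-length input is a choice made for this sketch.

```python
import re


def normalize_hex(raw: str) -> str:
    """Strip whitespace, dashes, and 0x prefixes; return lowercase hex digits."""
    tokens = re.split(r"[\s\-]+", raw.strip())
    # Remove a leading 0x / 0X from each token, if present.
    digits = "".join(t[2:] if t[:2].lower() == "0x" else t for t in tokens)
    if len(digits) % 2 or not re.fullmatch(r"[0-9a-fA-F]*", digits):
        raise ValueError("input is not a whole number of hex byte pairs")
    return digits.lower()


def decode_hex(raw: str) -> bytes:
    """Decode loosely formatted hex text into raw bytes."""
    return bytes.fromhex(normalize_hex(raw))
```

With this sketch, "0x48 0x65 0x6C 0x6C 0x6F" and "48-65-6c-6c-6f" both normalize to 48656c6c6f and decode to the bytes of "Hello".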

Typical Use Cases

Repairing garbled Chinese in scraped content: When scraping legacy GBK-encoded Chinese sites, response.text can come back as garbled text like "ÄãºÃ". This happens when the GBK bytes are decoded as Latin-1 (ISO-8859-1), the fallback that HTTP clients such as Python's requests use when a text response declares no charset. Paste the garbled string into GBK mode to recover the original Chinese without rewriting the scraper or switching to a Python environment with charset detection.
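
The repair itself amounts to undoing the wrong decode: map each character back to its single byte via Latin-1, then decode those bytes as GBK. A minimal Python sketch, with hypothetical function names and no claim to match the tool's implementation:

```python
def repair_gbk_mojibake(garbled: str) -> str:
    """Recover Chinese text from GBK bytes that were decoded as Latin-1."""
    try:
        # Latin-1 maps code points 0-255 one-to-one onto bytes 0x00-0xFF.
        raw = garbled.encode("latin-1")
    except UnicodeEncodeError as exc:
        # A code point above 255 means the original bytes are already lost.
        raise ValueError("input contains characters outside Latin-1") from exc
    return raw.decode("gbk")


def preview_mojibake(text: str) -> str:
    """Reverse direction: show how the text looks after the wrong decode."""
    return text.encode("gbk").decode("latin-1")
```

Here repair_gbk_mojibake("ÄãºÃ") returns "你好", and preview_mojibake("你好") reproduces "ÄãºÃ".
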
Decoding hex in headers and bodies: When debugging APIs, encrypted headers, signature values, and device fingerprints often appear as hex (e.g. X-Sign: a1b2c3d4...). Paste them into Hex mode to recover readable bytes for further analysis with Base64 or Hash modes.
XSS vulnerability verification: Security engineers use the HTML entity mode to check whether an input point escapes special characters correctly. Enter <img src=x onerror=alert(1)> and compare the encoded output with what the application actually renders to confirm its defensive measures.
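
For reference, the same entity encoding can be reproduced with Python's standard html module. This is a sketch of the expected escaped form, not the tool's code; the variable names are illustrative.

```python
import html

payload = "<img src=x onerror=alert(1)>"

# quote=True (the default) also escapes double and single quotes.
encoded = html.escape(payload, quote=True)

# All five special characters, for comparison.
five_chars = html.escape("&<>\"'", quote=True)

# html.unescape performs the reverse conversion.
decoded = html.unescape(encoded)
```

In this sketch, encoded is &lt;img src=x onerror=alert(1)&gt;. If the application under test echoes the payload back with the angle brackets intact, it is not escaping that input.
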
Base64 decoding in log analysis: When investigating production logs containing Base64-encoded error messages or context data, paste the string into the tool for instant decoding — no more echo "xxx" | base64 -d in the terminal.
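
Log fields often carry URL-safe Base64 or drop the trailing padding, which a strict decoder rejects. The Python sketch below shows one tolerant way to decode such values. It assumes the payload is UTF-8 text, and the function name decode_base64_text is hypothetical; whether the tool applies the same leniency is not stated on this page.

```python
import base64


def decode_base64_text(value: str) -> str:
    """Decode standard or URL-safe Base64, with or without padding, as UTF-8."""
    cleaned = "".join(value.split())  # drop spaces and line breaks
    # Map the URL-safe alphabet back to the standard one.
    cleaned = cleaned.replace("-", "+").replace("_", "/")
    # Restore padding so the length is a multiple of four.
    cleaned += "=" * (-len(cleaned) % 4)
    return base64.b64decode(cleaned, validate=True).decode("utf-8")
```

For example, both "aGVsbG8=" and the unpadded "aGVsbG8" decode to "hello".
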
File integrity verification: After downloading images, dependency packages, or binaries, compute SHA-256 in Hash mode and compare against the official checksum to confirm the file was not tampered with during transfer.
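
For files too large to paste, or for scripted checks, the same comparison can be done locally. The Python sketch below hashes a file in chunks and compares the result with a published checksum after normalizing case and whitespace. The function names sha256_of_file and matches_checksum are hypothetical, and the 1 MiB chunk size is an arbitrary choice.

```python
import hashlib
import hmac


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the lowercase SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path: str, published: str) -> bool:
    """Compare a file's digest with a published one, ignoring hex case."""
    expected = published.strip().lower()
    return hmac.compare_digest(sha256_of_file(path), expected)
```

Lowercasing the published value first avoids false mismatches when a vendor lists the checksum in uppercase.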

Common Mistakes & Best Practices

  • Treating Base64 as encryption: Base64 is an encoding scheme, not encryption, and anyone can decode a Base64 string. Never rely on Base64 to protect sensitive information (passwords, keys, tokens); send such data over an encrypted channel (HTTPS/TLS) or encrypt it with a real cipher.
  • Hex case mismatch causing comparison failures: Hex itself is case-insensitive (48656c6c6f equals 48656C6C6F), but some protocols and tools settle on one case. MAC addresses and TLS certificate fingerprints are often shown in uppercase, while signature digests and API hashes are usually lowercase. This tool outputs lowercase by default and accepts either case on decode, but always normalize case (toLowerCase() / toUpperCase()) on both sides before comparing hex strings across systems.
  • GBK repair fails on non-single-byte input: If the garbled string contains emoji, supplementary-plane characters, or anything with a codepoint ≥ 256, the byte information is already lost (most likely replaced by U+FFFD during a lossy UTF-8 decode upstream). The tool errors out instead of guessing. Repair only works for the specific case where GBK bytes were one-to-one mapped to characters via Latin-1.
  • Incomplete HTML entity encoding: Escaping only < and & is insufficient. An unescaped double quote (") lets input break out of a double-quoted attribute value, and an unescaped single quote (') does the same in single-quoted attributes; either can lead to XSS injection. Escape all five characters (& < > " ') and use a Content-Security-Policy header for defense in depth.
  • Using hashes as reversible encoding: MD5 / SHA-256 are one-way, so the output cannot be "decrypted" back to the input. For a reversible string representation, use Base64 or Hex; for protecting sensitive data, use real symmetric (AES) or asymmetric (RSA / ECC) encryption. Hashes are for integrity checking and uniqueness fingerprints. For password storage, use a dedicated salted, deliberately slow password-hashing function (bcrypt, scrypt, Argon2) rather than bare MD5 / SHA-256.

FAQ