Binary to Text
Convert text to binary and binary back to text.
Reviewed by Aygul Dovletova · Last reviewed
Converting Between Raw Bits and Readable Text
- Pick a direction with the mode tabs: "Text → Binary" encodes your typed characters to
0/1digits; "Binary → Text" reverses the process. - Paste or type your input into the input area. For text mode, any Unicode is accepted (emoji, CJK, RTL). For binary mode, paste a sequence of
0s and1s optionally separated by whitespace; other characters are stripped. - Toggle "Space-separated bytes" to insert (or require) a space between every byte-size chunk. This is how most academic and textbook examples are formatted (
01001000 01101001). - Toggle "8-bit padding" to pad each character\'s binary to exactly 8 bits. Turning it off produces variable-width output for ASCII-range characters (7 bits) which saves space but breaks multi-byte compatibility.
- Read the output, which updates live as you type. No button to click - the transformation runs on every keystroke via the
onInputhandler. Click the Copy button to drop the output into your clipboard.
Unicode Code Points, UTF-8 Bytes, and Why They Differ
The text-to-binary path calls codePointAt(0).toString(2) on each character. This is subtly different from UTF-8: it produces the raw Unicode scalar value, so "A" (U+0041) becomes 01000001 and emoji "😀" (U+1F600) becomes a 17-bit value. True UTF-8 would encode that emoji as 4 bytes. This tool\'s representation is closer to UCS-2 code units than wire-format UTF-8, fine for educational demos and matching most "binary to text" tools. For actual UTF-8 byte output (network protocol debugging), run through new TextEncoder().encode(str) first - the path the hashing and HMAC tools on this site use. Decoding reverses: split into chunks, parseInt(chunk, 2), String.fromCodePoint.
Why You Would Actually Need This
- Solving CTF (Capture The Flag) challenges where the flag is hidden as binary digits in an image or text file.
- Explaining to a student how ASCII or Unicode maps characters to numbers - toggling 8-bit padding shows the zero-padded byte boundaries that matter in memory layout.
- Debugging a protocol where bytes are logged as ones and zeros (rare in modern stacks but common in legacy telecom or RS-232 serial dumps).
- Preparing a riddle or puzzle message to share: encode a sentence, send the binary string, recipient decodes it with the same tool.
- Validating that a bit-manipulation function in your code produces the expected binary for a known input character.
- Converting a binary string from an embedded-systems datasheet example into human-readable text to verify you understood the encoding.
Pitfalls When Round-Tripping Through Binary
The top footgun is mismatched padding settings between encoding and decoding. If you encode "Hi" with 8-bit padding on, you get 0100100001101001; if you decode that same string with padding off (expecting 7-bit chunks) the tool splits it as 0100100 0011010 01 and produces nonsense. Both ends must agree on the chunk size. Second, non-BMP Unicode characters (emoji, CJK extensions) occupy two UTF-16 code units or a single code point above U+FFFF; this tool reads [...text] which iterates by code point so emoji round-trip correctly, but a hand-written loop using text[i] would split a surrogate pair and produce broken output. Third, the binary-to-text path silently drops non-01 characters via .replace(/[^01\\s]/g, ''), which is forgiving but means a typo like "01001oo01" gets cleaned to "0100101" without complaint. Fourth, a chunk that represents code point 0 (a null character) is dropped from the output to avoid polluting display with invisible nulls; this is an intentional departure from strict lossless round-trip.
The Broader Binary-to-Text Encoding Landscape
Pure binary is pedagogical but inefficient - 8 characters per byte. Real protocols use denser encodings. Base64 (RFC 4648 section 4) packs 3 bytes into 4 characters using A-Z, a-z, 0-9, +, /; url-safe Base64 (section 5) swaps + and / for - and _. Base32 (section 6) uses A-Z plus 2-7, case-insensitive and without 0/O/1/I confusion - used by TOTP secrets and BIP-39. Ascii85 (PDF streams) packs 4 bytes into 5 characters for 25 percent better density than Base64. Z85 is a URL-safe Ascii85 variant. Quoted-Printable (RFC 2045) encodes only non-ASCII as =XX hex, keeping most email text readable. Hexadecimal is base 16. Each trades density against alphabet safety.
Binary vs Base64, xxd, and System Tools
On Unix, echo "Hi" | xxd -b produces the binary representation of each byte (00101001 01001000 01101001...) and is the closest CLI equivalent. xxd without flags gives hex; base64 filename produces Base64; od -An -tb filename produces octal binary. All of those handle files arbitrarily large and stream from disk. This tool\'s advantage is interactive live preview and running in a browser tab. Its disadvantage is conceptual: the text-to-binary here maps to code points, not to wire-format UTF-8 bytes, so a byte-accurate debug dump should use xxd -b or a similar native tool. For true UTF-8 bytes, use the hash generator tools on this site (which encode text to UTF-8 before hashing) or use new TextEncoder().encode(str) in Node\'s REPL. The dedicated Base64, Base32, and URL encoder tools on this site cover the other entries in the encoding landscape.
Frequently Asked Questions
Does the tool produce true UTF-8 binary or Unicode code-point binary?
Unicode code-point binary, not UTF-8 wire format. The letter "A" (U+0041) produces 01000001 in both representations by coincidence since ASCII is a subset of Unicode. For characters above U+007F they diverge: U+00E9 ("e with acute") is code-point 11101001 (8 bits) in this tool, but the UTF-8 byte sequence is 11000011 10101001 (16 bits). For byte-accurate network protocol work, use a hex or UTF-8-specific tool; this tool targets the educational case of showing how characters map to numeric code points in binary.
Is my text stored or sent anywhere?
No. The tool runs in the browser, using the JavaScript String methods and a simple conversion routine. There is no fetch call, no WebSocket, no analytics event containing your text, and no localStorage persistence. You can type a password or private document into the input box and watch in DevTools Network tab - zero requests fire. The tab closing erases the state.
Why does my decoded output differ from what I encoded?
Mismatched settings between encode and decode. The most common cause is the 8-bit padding toggle: if you encode without padding, ASCII characters become 7-bit chunks (1000001 for A), but decoding with 8-bit padding on splits the stream into 8-bit groups and produces garbage. Set padding and space-separation the same way on both sides. Second cause is pasting binary that was produced by a different tool that uses UTF-8 wire format instead of code points - the chunk boundaries will not align.
Can I decode binary that has extra spaces or line breaks?
Yes. The binary-to-text function strips all non-digit whitespace characters with a regex, then splits on whitespace if space-separation is enabled. Newlines, tabs, and multiple spaces between groups are handled. Non-binary characters (letters, punctuation) are silently dropped, which is convenient but can hide typos - if you paste "10O01" meaning one-zero-zero-zero-one, the tool cleans it to "1001" which is a different value.
How does this handle emoji and CJK characters?
Emoji in the Basic Multilingual Plane have code points up to U+FFFF; emoji above that (most modern faces, flags) go up to U+10FFFF and produce up to 21 bits. CJK characters in the BMP are 15-16 bits each. The text is iterated with [...text] which respects surrogate pairs, so round-trip encoding preserves all these characters. For UTF-8 byte-level analysis, use a hex-dump tool instead.
Why does the null byte get dropped during decoding?
A chunk of all zeros represents code point 0 (U+0000 NULL), which renders invisibly and often breaks terminals and HTML. The decoder checks for zero and emits nothing to avoid confusing output. If you decode a binary stream that legitimately contains null bytes (embedded-systems datasheets, serialized C strings), this tool silently loses them; use a hex-oriented tool for that workload.
Does this tool handle Base64 or Base32?
No - those are different encodings covered by dedicated tools on this site. Base64 (RFC 4648 section 4) maps 3 bytes to 4 characters using A-Z, a-z, 0-9, +, /; Base32 (RFC 4648 section 6) maps 5 bytes to 8 characters using A-Z and 2-7. Both are denser than raw binary. Use the Base64 Encoder/Decoder and Base32 tools on this site for those; this page is for converting between text and literal binary digits.
Can I paste binary from a CTF challenge and get the flag?
Often yes. CTF flag encoding with space-separated 8-bit binary is one of the most common beginner puzzles, and this tool decodes them directly. Set the mode to "Binary to Text", enable 8-bit padding and space-separation as appropriate, paste the binary, and read the decoded text. If the flag is UTF-8 encoded (non-ASCII characters), the decoded output may look garbled because the tool decodes code points, not UTF-8 bytes; try splitting the binary into 8-bit groups and running through a UTF-8-aware decoder in that case.
More Text Tools
Case Converter
Convert text between UPPER, lower, Title, Sentence, camelCase, snake_case and more.
Open toolCharacter Counter
Count characters with platform-specific limits for Twitter, Instagram and more.
Open toolEmoji Picker & Search
Search and copy emojis by name or category.
Open toolFancy Text Generator
Generate stylish text with bubbles, squares, upside down and more for social media.
Open toolFind & Replace
Find and replace text with regex support and case-sensitive options.
Open toolHTML to Markdown
Convert HTML to clean Markdown text.
Open tool