ZeroUtil

String Length Calculator

Calculate string length in characters, bytes (UTF-8/UTF-16) and graphemes.


How to Use the String Length Calculator

Paste or type any text into the input box above. The calculator instantly displays multiple string metrics that developers need when working with text encoding and storage.

Metrics Explained

  • .length: JavaScript's string.length property. Counts UTF-16 code units, not visible characters. Characters stored as surrogate pairs (most emoji, CJK extension characters above U+FFFF) count as 2.
  • UTF-8 Bytes: The number of bytes when the string is encoded as UTF-8. ASCII characters use 1 byte; most emoji use 4 bytes.
  • UTF-16 Bytes: The number of bytes in UTF-16 encoding (2 bytes per code unit). Useful as a rough estimate of in-memory size in Java, C# and JavaScript engines, although some runtimes store Latin-1-only strings more compactly.
  • Graphemes: The number of user-perceived characters. A family emoji like 👨‍👩‍👧‍👦 is one grapheme but seven code points (four emoji joined by three zero-width joiners).
  • Code Points: Unicode code points via the spread operator [...str].length. Each code point counts once, so a surrogate pair counts as 1.
  • Lines: Number of lines based on newline (\n) characters.
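
The metrics above can be approximated in a few lines of JavaScript. This is a minimal sketch, not the calculator's actual source: the function name stringMetrics is hypothetical, it assumes a runtime that provides TextEncoder and Intl.Segmenter, and counting an empty string as zero lines is an assumption.

```javascript
// Hypothetical helper: computes the six metrics described above.
function stringMetrics(str) {
  // Intl.Segmenter splits text into user-perceived characters (graphemes).
  const segmenter = new Intl.Segmenter(undefined, { granularity: "grapheme" });
  return {
    length: str.length,                               // UTF-16 code units
    utf8Bytes: new TextEncoder().encode(str).length,  // bytes when encoded as UTF-8
    utf16Bytes: str.length * 2,                       // 2 bytes per code unit
    graphemes: [...segmenter.segment(str)].length,    // user-perceived characters
    codePoints: [...str].length,                      // string iteration is by code point
    lines: str === "" ? 0 : str.split("\n").length,   // assumption: empty input has 0 lines
  };
}

// "a", U+1F600 (grinning face), a newline, then "b"
console.log(stringMetrics("a\u{1F600}\nb"));
```

For that sample input, .length is 5 while the grapheme and code point counts are both 4, because the emoji occupies two code units.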

When .length Differs from Grapheme Count

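A yellow note appears when .length and the grapheme count diverge. That happens when the text contains characters that span more than one UTF-16 code unit (such as emoji) or more than one code point (such as combining accents). The difference matters for form validation, database column limits, and API payload sizes, where a limit may be counted in a different unit than the characters a user sees.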
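
For form validation, one option is to enforce the limit in graphemes and flag the divergence separately. The sketch below is illustrative only: checkLength and its return shape are hypothetical names, and it assumes Intl.Segmenter is available.

```javascript
// Hypothetical validator: limits input by user-perceived characters.
function checkLength(text, maxGraphemes) {
  const segmenter = new Intl.Segmenter(undefined, { granularity: "grapheme" });
  const graphemes = [...segmenter.segment(text)].length;
  return {
    ok: graphemes <= maxGraphemes,        // within the visible-character limit
    graphemes,
    diverges: text.length !== graphemes,  // true when .length would mislead
  };
}

// "hi" followed by U+1F600: 3 graphemes, but .length is 4
console.log(checkLength("hi\u{1F600}", 3));
```

A check based on text.length would reject this three-character input against a limit of 3, which is the situation the yellow note is warning about.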

Frequently Asked Questions

Why does JavaScript .length return 2 for some emoji?

JavaScript strings are sequences of UTF-16 code units. Characters outside the Basic Multilingual Plane (above U+FFFF) are stored as surrogate pairs (two code units), so .length counts them as 2. For example, 😀 (U+1F600) has a .length of 2 but is a single grapheme.
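
The surrogate-pair arithmetic can be checked directly. In this short sketch, toSurrogatePair is a hypothetical helper that applies the standard UTF-16 encoding formula to a code point above U+FFFF.

```javascript
// Hypothetical helper: split a supplementary code point into its two UTF-16 code units.
function toSurrogatePair(codePoint) {
  const offset = codePoint - 0x10000;         // 20-bit value
  const high = 0xd800 + (offset >> 10);       // top 10 bits
  const low = 0xdc00 + (offset & 0x3ff);      // bottom 10 bits
  return [high, low];
}

const grin = "\u{1F600}"; // grinning face, U+1F600
console.log(grin.length);                        // 2
console.log(grin.charCodeAt(0).toString(16));    // "d83d" (high surrogate)
console.log(grin.charCodeAt(1).toString(16));    // "de00" (low surrogate)
console.log(grin.codePointAt(0).toString(16));   // "1f600"
console.log(toSurrogatePair(0x1f600).map((u) => u.toString(16))); // ["d83d", "de00"]
```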

What is the difference between UTF-8 and UTF-16 byte length?

UTF-8 uses 1 to 4 bytes per code point (ASCII = 1 byte, most CJK characters = 3 bytes, emoji = 4 bytes). UTF-16 uses 2 bytes for characters in the Basic Multilingual Plane and 4 bytes for supplementary characters (surrogate pairs). UTF-8 is more compact for ASCII-heavy text; UTF-16 is more compact for CJK text.
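
A quick comparison makes the trade-off concrete. This sketch assumes TextEncoder is available; byteLengths and the sample strings are illustrative choices, not part of the tool.

```javascript
// Hypothetical helper: byte size of a string in each encoding.
function byteLengths(str) {
  return {
    utf8: new TextEncoder().encode(str).length,
    utf16: str.length * 2, // each UTF-16 code unit is 2 bytes
  };
}

console.log(byteLengths("hello"));              // { utf8: 5, utf16: 10 }
console.log(byteLengths("\u65E5\u672C\u8A9E")); // 日本語: { utf8: 9, utf16: 6 }
console.log(byteLengths("\u{1F600}"));          // 😀: { utf8: 4, utf16: 4 }
```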

What is a grapheme cluster?

A grapheme cluster is the smallest user-perceived character unit. It can be a single code point (like "A") or multiple code points combined (like "é" as e + combining accent, or a ZWJ emoji sequence). The grapheme count matches what users see on screen.
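
Both cases can be reproduced with Intl.Segmenter. The following is a sketch under the assumption that the runtime ships Intl.Segmenter with full Unicode data; countGraphemes is a hypothetical helper name.

```javascript
const segmenter = new Intl.Segmenter(undefined, { granularity: "grapheme" });
// Hypothetical helper: number of grapheme clusters in a string.
const countGraphemes = (s) => [...segmenter.segment(s)].length;

// "e" followed by U+0301 COMBINING ACUTE ACCENT renders as one character.
const decomposed = "e\u0301";
const composed = decomposed.normalize("NFC"); // single code point U+00E9
console.log(decomposed.length, composed.length);                         // 2 1
console.log(countGraphemes(decomposed), countGraphemes(composed));       // 1 1

// Family emoji: man, woman, girl, boy joined by U+200D ZERO WIDTH JOINER.
const family = "\u{1F468}\u200D\u{1F469}\u200D\u{1F467}\u200D\u{1F466}";
console.log(family.length, [...family].length, countGraphemes(family));  // 11 7 1
```

Normalizing to NFC reduces .length for the accented letter but does not help with ZWJ sequences, which have no single-code-point form.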

Why should I care about byte length?

Many systems have byte-level limits. MySQL and PostgreSQL count VARCHAR(n) in characters, but MySQL's row-size and index-key limits are measured in bytes, and Oracle's VARCHAR2 defaults to byte semantics. HTTP servers typically cap header size in bytes, and network payload size affects bandwidth. Knowing the byte length helps you avoid silent truncation and encoding errors.
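
When a byte limit applies, text can be trimmed before it is sent so that no character is cut in half. The sketch below is one possible approach, not a feature of this tool: truncateToUtf8Bytes is a hypothetical name, and it assumes TextEncoder and Intl.Segmenter are available.

```javascript
// Hypothetical helper: keep as many whole graphemes as fit in maxBytes of UTF-8.
function truncateToUtf8Bytes(str, maxBytes) {
  const encoder = new TextEncoder();
  const segmenter = new Intl.Segmenter(undefined, { granularity: "grapheme" });
  let out = "";
  let used = 0;
  for (const { segment } of segmenter.segment(str)) {
    const size = encoder.encode(segment).length; // UTF-8 size of this grapheme
    if (used + size > maxBytes) break;           // stop before exceeding the limit
    out += segment;
    used += size;
  }
  return out;
}

// "ab" + U+1F600 + "cd": the emoji needs 4 bytes, so a 5-byte limit keeps only "ab".
console.log(truncateToUtf8Bytes("ab\u{1F600}cd", 5)); // "ab"
```

Walking by grapheme rather than by byte or code unit means the result never ends in a partial UTF-8 sequence or a lone surrogate, at the cost of sometimes using fewer bytes than the limit allows.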

Does my text get sent to a server?

No. All calculations run entirely in your browser using JavaScript. Your text never leaves your device; nothing is uploaded or stored.
