Tokenizer

Paste any text and watch it shatter into the tokens an LLM actually sees — color-coded, counted live, surprises and all.

0 Tokens
0 Characters
0.0 Chars / token
Tokenized with GPT-4o-style (o200k_base) BPE: token boundaries come from the actual encoding, not from an estimate.
Colors are stable per token id, so the same token keeps its color as you edit and identical tokens match. A dashed chunk is a partial-byte token — part of a multi-byte character (like an emoji) that only completes on the next token.
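The two display rules above can be sketched in a few lines of Python. This is only an illustration, not the page's actual code: the names `token_color` and `is_partial_byte_token` are hypothetical, the golden-ratio hue scheme is an assumed way to get a stable color from a token id, and the partial-byte check assumes you already have each token's raw bytes (for example, from a BPE library's per-token byte lookup).

```python
import colorsys

# Hypothetical helper: derive a color purely from the token id, so the same
# id always yields the same color no matter where it appears in the text.
def token_color(token_id: int) -> str:
    # Assumed scheme: step the hue by the golden-ratio conjugate so that
    # neighbouring ids land on well-separated hues.
    hue = (token_id * 0.618033988749895) % 1.0
    r, g, b = colorsys.hls_to_rgb(hue, 0.80, 0.70)
    return "#{:02x}{:02x}{:02x}".format(
        round(r * 255), round(g * 255), round(b * 255)
    )

# Hypothetical helper: a token is "partial-byte" when its bytes are not
# valid UTF-8 on their own, i.e. it holds only a fragment of a character.
def is_partial_byte_token(token_bytes: bytes) -> bool:
    try:
        token_bytes.decode("utf-8")
        return False
    except UnicodeDecodeError:
        return True

# The emoji below is four bytes in UTF-8. If a tokenizer split it after the
# third byte, neither piece would decode by itself.
emoji_bytes = "😀".encode("utf-8")
first_piece, last_piece = emoji_bytes[:3], emoji_bytes[3:]
```

In this sketch both fragments of a split character are flagged, the leading piece and the trailing one, since neither is displayable alone. Because the color depends only on the id, editing the surrounding text never changes the color of a token that survives the edit.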