BPE Tokenizer Visualizer

See Byte Pair Encoding happen step by step. Type any text and watch character pairs merge, building the vocabulary in real time.

Week 1 · Day 1 | Week 6 · Day 3

Input

Text to tokenize

Merge Steps

0 = character-level · max = fully merged

Tokenized Output

Enter text above to see tokenization

Tokens

Characters

Vocabulary Size

O(n²) Attention Cost

Sequence length impact: Transformer attention cost scales as O(n²), where n is the sequence length in tokens. For example, a sequence of 43 tokens requires 43² = 1,849 pairwise attention scores per head.
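The arithmetic above can be checked with a short sketch. The function name `attention_cost` is an assumption made for illustration. It counts only the n × n query-key score pairs for one head and ignores constant factors such as the head dimension.

```python
def attention_cost(n_tokens: int) -> int:
    """Return the number of query-key score pairs for one attention head.

    Every token attends to every token, so the count is n * n.
    Constant factors (head dimension, softmax, etc.) are ignored.
    """
    return n_tokens * n_tokens


cost_43 = attention_cost(43)
print(cost_43)  # 1849
```

Doubling the sequence length quadruples this count, which is why shorter token sequences matter.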

Character-Level vs BPE

Character-Level (no merges)

Sequence length: 0 · O(n²) = 0

BPE (current merges)

Sequence length: 0 · O(n²) = 0
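The comparison shown in these two panels can be sketched as follows. The helper `compare_costs` and the example token counts are hypothetical and chosen for illustration; they are not values produced by the visualizer.

```python
def compare_costs(n_chars: int, n_bpe: int) -> dict:
    """Compare the O(n^2) attention cost of two tokenizations of the same text."""
    char_cost = n_chars ** 2
    bpe_cost = n_bpe ** 2
    # The ratio is undefined for an empty BPE sequence, so report None there.
    ratio = char_cost / bpe_cost if bpe_cost else None
    return {"char_cost": char_cost, "bpe_cost": bpe_cost, "ratio": ratio}


# Hypothetical example: 43 characters compressed to 20 BPE tokens.
result = compare_costs(43, 20)
print(result["char_cost"], result["bpe_cost"])  # 1849 400
```

Because the cost is quadratic, shrinking the sequence by a factor c reduces the attention cost by a factor of c².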

Merge Table

At each step, adjacent token pairs are counted and the most frequent pair is merged into a single new token. Each merge adds one token to the vocabulary.

#	Pair	New Token ID	Count
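A minimal sketch of how rows for a merge table like this could be produced is given below. It is not the visualizer's actual implementation. The function name `learn_bpe`, the ID scheme (sorted characters first, then one new ID per merge), and the tie-breaking rule (first pair encountered wins) are assumptions.

```python
from collections import Counter


def learn_bpe(text: str, num_merges: int):
    """Run up to num_merges BPE merges on text, starting at character level."""
    # Assumed base vocabulary: distinct characters in sorted order, ID = index.
    vocab = sorted(set(text))
    char_to_id = {ch: i for i, ch in enumerate(vocab)}
    tokens = [char_to_id[ch] for ch in text]
    merge_table = []  # rows of (step, pair, new_token_id, count)

    for step in range(1, num_merges + 1):
        # Count adjacent pairs (overlapping occurrences are counted too).
        pair_counts = Counter(zip(tokens, tokens[1:]))
        if not pair_counts:
            break  # fewer than two tokens left: fully merged
        # most_common breaks ties by first-encountered order.
        pair, count = pair_counts.most_common(1)[0]

        new_id = len(vocab)
        vocab.append(vocab[pair[0]] + vocab[pair[1]])
        merge_table.append((step, (vocab[pair[0]], vocab[pair[1]]), new_id, count))

        # Replace every non-overlapping occurrence of the pair, left to right.
        merged, i = [], 0
        while i < len(tokens):
            if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) == pair:
                merged.append(new_id)
                i += 2
            else:
                merged.append(tokens[i])
                i += 1
        tokens = merged

    return [vocab[t] for t in tokens], vocab, merge_table


tokens, vocab, merge_table = learn_bpe("aaabdaaabac", 3)
for step, pair, new_id, count in merge_table:
    print(step, pair, new_id, count)
print(tokens)  # ['aaab', 'd', 'aaab', 'a', 'c']
```

With `num_merges=0` the output stays at character level, which corresponds to the leftmost slider position. Production tokenizers typically operate on bytes and pre-split text into words first; this sketch skips both steps for brevity.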